I. Introduction and Goals

Computer vision is finding increased application in robotics
and for shipbuilding, construction, machining, sorting, and
inspection. An essential part of a computer vision system is the
ability to measure range (distance) to points on the surfaces of
three-dimensional objects. This information allows the
description of such intrinsic characteristics of objects as
orientation, surface condition, and dimensions.

A  goal  of  the  work  described  here  was  the  development, 
testing,  and  evaluation  of  a  real-time  system  for  measuring 
range,  suitable  for  the  kinds  of  applications  described  above.  A 
second  goal  was  the  development  of  algorithms  for  description  of 
objects,  based  on  their  range-image  representations. 


In collaboration with the National Bureau of Standards
(NBS), George Washington University (GWU) has shown the
feasibility, and measured the performance, of a prototype ranging
system that uses a technique called structured light; GWU has
also derived and implemented algorithms for rapid description of
a limited class of objects using only the data that would be
supplied by such a system.

II.  Background,  approach,  and  role  of  computer  vision 

A.  Role  and  competence  of  structured-light  systems 

Machine vision attempts to extract information about the objects
in a scene from the manner in which the objects modulate and
reflect the illumination falling on them. In most cases, normal
ambient light sources provide minimal constraints which can be
employed to simplify this task. Structured light techniques for
machine vision employ illumination sources which provide more
direct information than is available from normal ambient light.
Nonetheless, many approaches to machine vision have attempted to
solve the general vision and image understanding problem without
recourse to artificially structured illumination. While this
presents many interesting problems from an academic standpoint,
and is required in those special applications where ambient
illumination must be employed, the speedy development of
practical applications in automated assembly and manufacturing
could profit from far greater exploitation of structured-light
approaches.

In the majority of near-term applications, such as
shipbuilding, construction, machining, welding, assembly, inspection,
maintenance, painting, sorting, and materials handling, there is
no obstacle to the use of artificially structured illumination.
This approach facilitates precision optical ranging and
measurement, and greatly simplifies many of the traditional problems
of image processing and understanding. Moreover, projectors for
structured illumination may be simple and compact, and may be
mounted on the manipulator along with the camera.

One of the greatest advantages of the structured light
approach is the ability to simplify the determination of depth
in the image. This arises from the possibility of knowing the
exact camera-relative angle of origin of the ray illuminating any
part of the scene. Determination of depth then reduces to a
straightforward process of triangulation. In simple
applications, a single plane of light may be projected from a
manipulator, and its intersection with objects in the field observed
by an offset camera. Experiments at NBS have shown this
arrangement to be effective for the real-time control of reflexive
seek-and-grasp operations, and even for simple
object-discrimination tasks.


Additional planes of projected light in such a system enable
the direct sensing of three-dimensional tilt of surfaces. When
combined with two-dimensional outline images obtained from
point-source illumination, such a system is
adequate for industrial parts manipulation. With minor
modifications (projection onto a surface and examination with a
telescope), the system will also work well for automated
inspection.

The work described here has made it possible to read the
Z-axis coordinate (depth) of any pixel in the scene directly, and
thus to define any point of an imaged object immediately in terms
of its coordinates in three-dimensional visual space. This
ability is central to control of a manipulator that must interact
with random objects in real space. However, its most important
application is in image understanding, where three-dimensional
information is invaluable in resolving ambiguities of projected
shape, occlusion, relative size, orientation, segmentation, and
the like. So fundamental is this information, that current
non-structured (ambient) light techniques devote the majority of
their processing time to obtaining three-dimensional
descriptions from ambient light cues, or to classifying objects on the
basis of two-dimensional cues generated by surface interactions
in three dimensions. As a result of the processing bottleneck
represented by the difficulty of obtaining three-dimensional
information, there are currently no ambient-light vision
techniques suitably developed for commercial real-time applications
except in highly constrained environments (such as flat parts on
a conveyor) where three-dimensionality may be ignored safely.

Lack of adequate interpreted sensory information is perhaps
the greatest single obstacle to the commercial exploitation of
robotics. While a full solution to the ambient-light vision
problem appears to be many years off, structured-light approaches
offer a realizable near-term solution that is applicable in a
large variety of important military and industrial settings. It
seems likely that any program which could promote the
development of commercially available structured-light vision systems
would have very favorable consequences for the field of automated
manufacturing and in turn for its military and industrial
applications. In the next section, we outline such a program.

To be useful in practical applications, a vision system must
operate in real time. Essentially, this means that it must be
able to accept data as fast as it can be provided, and that it
must complete analysis of that data to any given level of
complexity as rapidly as analyses at that level are required for
uninterrupted operation. Many of the algorithms currently in use
for the simpler forms of structured-light vision are within the
real-time capacity of standard computer hardware. Others run in
"near" real-time, while still others are undergoing development.
In all cases, orders-of-magnitude improvement could be expected
from application of dedicated specialized hardware, an example of
which was built and tested as part of this work.

Many techniques have been proposed for obtaining
three-dimensional information from ambient light cues. Among them may be
mentioned stereopsis, optic flow, shape from shading, and shape
from texture. All of these are areas of intense current interest
and activity, and many are modeled on psychophysical analogs.
However, a practical solution to the very complex problem of
three-dimensional ambient light vision may well require a
synthesis of many of these approaches, and all of them
individually are at best in an experimental state of development.
Moreover, all of these approaches are intensely computational in
nature, and will require improvements of many orders of magnitude
in computing speed to be competitive with structured light
techniques that have been demonstrated in real-time vision.
Structured light techniques may be adapted in graded levels of
complexity according to the task, from simple ranging and parts
acquisition to complex image understanding. In many instances
these techniques are ready to be moved out of the laboratory and
into practical development; in other cases the time to practical
development appears significantly shorter than that of any of the
ambient light techniques. Many structured light applications are
within the real-time computational power of standard computer
hardware; others appear to be very close to this goal. This
report describes several approaches.


B.  SPECIFIC  APPROACH 


B.1 The Basic Algorithm


The basic algorithm of our system employs triangulation.
Figure 1 illustrates distances $R_i$ of points $x_i$ of object A from a
reference plane Q, perpendicular to the plane of the paper and
including points C and P. It is possible to reconstruct a 3-D
representation of object A with respect to a viewer in plane Q.

The following relationship may be derived from the geometry
shown in Figure 2:

$$R_i = \frac{D}{\cot\alpha_i + \cot\beta_i}$$

where D is the distance between points C and P, and $R_i$ is called
"range" as defined earlier.

Therefore in order to identify a specific point and hence
calculate its range we need to measure or calculate $\alpha_i$ and
$\beta_i$ and their corresponding cotangents.

Suppose we have a template M such as the one in Figure 3,
and place it in front of reference point P at a fixed distance
$f_1$.

As shown in Figure 4 we can now decide the partition j
through which the line (or vector) connecting reference point P
to any point $x_i$ passes, which in turn will give us the value of
angle $\alpha_i$, hence $\cot\alpha_i$:

$$\cot\alpha_i = \frac{j \cdot a}{f_1}$$

where a is the height of each partition and $f_1$ is the distance
from reference point P to the template.
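
The two relationships above can be combined into a short numerical sketch. This is only an illustration under stated assumptions: the function names and the numbers are hypothetical and do not come from the report, and the camera-side cotangent is simply supplied as a value.

```python
def cot_alpha_from_partition(j, a, f1):
    """Cotangent of the illumination angle for mask partition j.

    a  -- height of one partition (same length unit as f1)
    f1 -- distance from reference point P to the template
    """
    return j * a / f1


def range_from_cotangents(D, cot_alpha, cot_beta):
    """Range by triangulation: R = D / (cot(alpha) + cot(beta))."""
    total = cot_alpha + cot_beta
    if total <= 0:
        raise ValueError("rays do not meet in front of the reference plane")
    return D / total


# Illustrative numbers only (not taken from the report).
cot_a = cot_alpha_from_partition(j=50, a=0.1, f1=10.0)
cot_b = 0.5  # assumed camera-side cotangent
R = range_from_cotangents(D=100.0, cot_alpha=cot_a, cot_beta=cot_b)
```

The range is returned in the same length unit as D, since the cotangents are dimensionless.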

In the design and construction of this vision system, C is
the focal point of a charge-injection device (CID) camera
(General Electric Model TN 2500) with a 244 x 248-pixel
resolution plane at focal distance $f_2$ from C, the reference point P
represents a high-intensity line source perpendicular to the
plane of the paper, and template M is made of glass so that the
light from source P can pass through and illuminate object A (see
Fig. 5).

Using a template such as that described above to measure
$\cot\alpha_i$ for every point of object A would be very time
consuming: the above design would fail to work in real time.
Therefore, instead of template M we use a series of coded glass
masks.

Figure 6 shows a reference mask "0" and eight masks numbered
1 through 8 which can be used for a ($2^8 = 256$) x 256 image.

In the following example we will show how such masks are
used in finding $\cot\alpha$. For simplicity, however, we will use a
4x4 image, which implies $\log_2 4 = 2$ masks. Note that in this
example black has the highest gray value, "1", and white the
lowest, "0". Also, the following criterion is exploited in
comparing two images (one is called "reference" and the other
"input" image): if the gray level of a pixel of an input image is
equal to or less than the corresponding pixel in the reference
image, a zero is stored in a memory corresponding to that pixel.


Referring to Figure 7a, we note that the projector P
illuminates the whole object A and the absolute gray values of various
points of object A (e.g. $x_1$, $x_2$, $x_3$, $x_4$) are stored in a memory
plane called the reference memory after being imaged by the
camera. This array is called the reference image.

In Figure 7b the first mask is placed in front of projector
P. Since half of mask "1" is opaque, points $x_3$ and $x_4$ would not
be illuminated; therefore, in comparing the resultant image with
the reference image, the algorithm would give points $x_1$ and $x_2$
the same gray level, while $x_3$ and $x_4$ have a higher gray value.
Thus the values "0" for the former points and "1" for the latter
points, respectively, will be stored in a memory called "Bit plane
memory 1". Note that the values stored in bit plane memories are
relative single-bit values for each pixel.

If we proceed in the same manner with mask "2" we will have
results such as the ones in bit plane memory 2 of Figure 7c.

In the next step the corresponding values of comparison for
each pixel will be collected, in sequential order, as a code (for
our example with 4x4 image we have 2 bit planes and hence 2-bit
code for each pixel; see Figure 7d). Converting this code to
decimal will tell us which partition j illuminated a point $x_i$,
and from the previous discussion we can find $\cot\alpha$.

Note that, depending on the type of image we have (e.g. 4x4,
8x8, ..., $2^n$ x $2^n$), we need 2, 3, ..., n masks in order to have an
n-bit code that defines every point of an image.
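
The comparison criterion and the collection of bit planes into a code can be sketched as follows. This is a minimal illustration, not the hardware implementation: the gray values are invented, the image is reduced to a single column holding the four points x1 to x4, and the layout assumed for mask "2" (alternate quarters opaque) is an assumption made for the example.

```python
def bit_plane(reference, masked):
    """One bit plane: 0 where input <= reference, else 1 (black is high)."""
    return [[0 if m <= r else 1 for r, m in zip(ref_row, in_row)]
            for ref_row, in_row in zip(reference, masked)]


def collect_codes(bit_planes):
    """Concatenate the planes' bits, first mask as most significant bit."""
    rows, cols = len(bit_planes[0]), len(bit_planes[0][0])
    codes = [[0] * cols for _ in range(rows)]
    for plane in bit_planes:
        for y in range(rows):
            for x in range(cols):
                codes[y][x] = (codes[y][x] << 1) | plane[y][x]
    return codes


DARK = 9                                   # hypothetical gray level of an unlit point
reference = [[2], [3], [2], [4]]           # x1..x4 under full illumination
mask1 = [[2], [3], [DARK], [DARK]]         # lower half of mask "1" opaque
mask2 = [[2], [DARK], [2], [DARK]]         # assumed layout of mask "2"

planes = [bit_plane(reference, mask1), bit_plane(reference, mask2)]
codes = collect_codes(planes)              # partition index for each point
```

Under these assumptions each of the four points receives a distinct 2-bit code, which is the partition index j used to look up $\cot\alpha$.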


Fig. 7. CONSTRUCTION OF THE BIT PLANE:

Fig. 7(a)  THE REFERENCE IMAGE, TO ACCOUNT FOR INCIDENT INTENSITY
AND OBJECT REFLECTIVITY (reference memory, actual gray level)

Fig. 7(b)  BIT-PLANE LEVEL ONE; RESULT OF TWO-BAR MASK (mask "1";
bit plane memory #1, relative gray level)

Fig. 7(c)  BIT-PLANE LEVEL TWO; RESULT OF FOUR-BAR MASK (mask "2";
bit plane memory #2)

Fig. 7(d)  COLLECTING MASKS' RESULTS INTO A SINGLE CODE


B.2  'Visible'  field  of  View 


By 'visible' field of view or 'visible' range we mean the
area which is both illuminated by the projector and within the
optical field of the camera. In this section we define the
boundaries of this field of view and find the resolution limits for
the CID camera.

From Fig. 9 we can write the equations for lines A, B,
C and D which constitute the contour of the visible field.

$$A:\quad y_A = (d - x_A)\cot 2\theta$$

$$B:\quad x_B = 0$$

$$C:\quad y_C = \frac{f_1}{m}\,x_C - f_1$$

$$D:\quad x_D = d$$

Note: P is at $x = 0$, $y = -f_1$; C is at $x = d$, $y = 0$; and
$\theta$ is the half-angle of the field-of-view of the camera.

The intersection of lines A and C gives the x-y coordinates (y
is actually the range) of the closest point O to the reference
plane which is visible:

$$x_O = \frac{d\cot 2\theta + f_1}{\frac{f_1}{m} + \cot 2\theta}$$

$$y_O = \frac{\cot 2\theta\,(d - m)}{1 + \frac{m}{f_1}\cot 2\theta}$$

Intersection of C and D, A and B will yield the locations
of the other two closest visible points, one on the projector
horizontal axis (Q) and the other on the horizontal axis and at
the same height as the camera (E). That is,

$$x_Q = 0, \qquad y_Q = d\cot 2\theta$$

$$x_E = d, \qquad y_E = \frac{d}{m}\,f_1 - f_1$$

So the following result expresses the visibility of an object or
a point on the object:

$$R_x = y_x \ge \begin{cases} \dfrac{f_1}{m}\,x - f_1, & x_E \ge x \ge x_O \\[4pt] (d - x)\cot 2\theta, & x_O \ge x \ge 0 \end{cases}$$

Invisible, otherwise.

Therefore for a point of an object to be in the 'visible' field
of view its range must be greater than or equal to
$f_1\left(\frac{x}{m} - 1\right)$ if $x_O \le x \le x_E$, or greater than
or equal to $(d - x)\cot 2\theta$ if $0 \le x \le x_O$,
where all the parameters are as defined previously.
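
As a rough check of the visibility condition, the test can be written out directly. The sketch below assumes the boundary equations given in this section; the function name and the numerical values are hypothetical, and m is simply the quantity that appears in the equation of line C.

```python
import math


def is_visible(x, R, d, f1, m, theta):
    """True if the point (x, R) lies in the 'visible' field of view.

    d     -- projector-camera separation
    f1    -- projector-mask distance
    m     -- mask dimension appearing in the equation of line C
    theta -- half-angle of the camera field of view, in radians
    """
    cot_2theta = 1.0 / math.tan(2.0 * theta)
    # x coordinate of O, the intersection of lines A and C
    x_o = (d * cot_2theta + f1) / (f1 / m + cot_2theta)
    if 0.0 <= x <= x_o:
        return R >= (d - x) * cot_2theta      # on or above line A
    if x_o < x <= d:
        return R >= f1 * (x / m - 1.0)        # on or above line C
    return False                              # outside lines B and D


# Illustrative geometry only: cot(2*theta) is about 1 and x_O is about 5.5.
near_projector = is_visible(2.0, 9.0, d=10.0, f1=1.0, m=1.0, theta=math.pi / 8)
too_close = is_visible(2.0, 5.0, d=10.0, f1=1.0, m=1.0, theta=math.pi / 8)
```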

Now we want to find the relationship between the resolution
and the range: suppose from a point $(x_1, r_1)$ we want to view
another point $(x_2, r_2)$ where:

$$x_2 = x_1 + \Delta x, \qquad r_2 = r_1 + \Delta r$$

Then,

$$\beta_1 = \tan^{-1}\frac{r_1}{d - x_1}, \qquad \beta_2 = \tan^{-1}\frac{r_2}{d - (x_1 + \Delta x)}$$

and

$$\alpha_1 = 90^\circ - (\theta + \beta_1), \qquad \alpha_2 = 90^\circ - (\theta + \beta_2)$$

Now the resulting images will be:

$$1' = \frac{n}{2}\left[1 - \cot\theta \cdot \tan\alpha_1\right]$$

(if $\beta$ is more than $(90^\circ - \theta)$ the
plus sign must be used)

$$2' = \frac{n}{2}\left[1 - \cot\theta \cdot \tan\alpha_2\right]$$

and the difference (or displacement) of the image by moving from
point 1 to point 2 is:

$$\Delta R' = |1' - 2'|$$

Therefore to be able to distinguish between different
points we require

$$\text{pixel size} \le \Delta R' < n - \min(1', 2')$$

In order to be able to resolve a point as a function of its
previous location (range and height) its image must differ by a
value greater than the pixel size of the camera (otherwise both
points would be resolved as one) and less than n (the width of
the CID plane) minus the minimum of the image values of points 1
and 2 (namely 1' and 2'). (See Appendices A and B for tabulation
and display of typical values of visible ranges and precision of
range estimates.)


B.3  Camera-Projector  Set  Up 


To calculate the range it is necessary to arrange the
apparatus as illustrated in Figure 10. Since $\alpha$ or $\cot\alpha$ is to
be decoded using the masks, the transparency of the stripes in
the masks must vary in the Z direction. If, however, the
set-up is like that of Figure 11, the variation in the masks is
in the Y direction. That results in decoding an angle, or its
cotangent, that does not contain any range information.

B.4  Light  Source  and  Slit 

An  important  consideration  in  this  system  is  the  necessity 
of  having  a  fine  light  source  with  the  narrowest  possible  slit 
while  providing  the  required  light  intensity.  The  need  for  a 
narrow  slit  is  illustrated  in  Fig.  12. 

As can be seen from Figure 12a the shaded areas are neither
completely illuminated nor completely dark. Therefore, in
comparison with the reference memory these areas will result in
wrong encodings. On the other hand, Figure 12b shows how a
perfect (but non-realizable) line slit will result in just two
distinct areas: illuminated and dark.

Therefore one must approximate a perfect slit. However,
with the light source that we had available (Sunpak 433D) it was
not possible to provide enough light through the slit; the
reduced intensity resulted in degradation of images and hence of
the estimates of range.


Fig. 10

CORRECT ORIENTATION OF APPARATUS FOR MEASURING RANGE


Fig. 11

AN INCORRECT ORIENTATION; THE WRONG ANGLE WILL BE MEASURED
(In this case the mask axis and the projector-camera axis are
parallel.)


Fig. 12

A WIDE SLIT YIELDS AMBIGUOUS REGIONS
(Panel a: wide slit or light source, and mask.)


To compensate for this problem we had to modify the
configuration of Figure 12a, by using an adjustable slit (see Fig. 13).
We could adjust the width of the slit to the smallest that would
give the sharpest possible images.

As noted above, the use of a non-perfect slit would degrade
the range result, especially through the finest masks since the
width of the slit would be almost the same as that of the
smallest mask division. In the following paragraph this
degradation is calculated. It should be pointed out that if the slit
causes degradation for a given mask, all the masks having finer
divisions will be degraded as well. For example, if we cannot
have a sharp image of mask no. 6, masks no. 7 and 8 will also
be degraded.

As explained in the algorithm description, there are eight
masks (1-8) which will result in the constitution of an 8-bit
code relating to $\cot\alpha$:

$$\cot\alpha = \mathrm{decimal}(B_0 B_1 B_2 B_3 B_4 B_5 B_6 B_7) \times \text{constant} \qquad \left(\text{constant} = \frac{a}{f_1}\right)$$

(where a = width of mask's bars, and $f_1$ = distance from mask to
slit)

Therefore, if mask i (i = 0, ..., 7) is degraded (i.e., gives a bad
image) the resulting code will have degradation in bars of
the finest mask. For example if the last mask (i = 7) does not
give a good image, the code can be $B_0 B_1 B_2 \ldots B_6 0$ or
$B_0 B_1 B_2 \ldots B_6 1$, which means we cannot distinguish between
the n-th bar and the (n+1)-th bar of the mask; in turn, that will
reduce the range precision. Suppose A is the point for which we are
white stripe the gray field will affect the true comparison. It
is  for  this  reason  that  we  use  the  'false-black'  as  a  more  proper 
name  for  'gray  area'. 


In the following paragraphs the ratio of false-black area
to correctly detected area is computed.

Fig. 15 shows the effect of false-black (gray-area) on the
accuracy of determining range of an object. It will be shown
that the false-black area caused by a stripe (white or clear
stripe) of the mask depends only on the number n of white-stripe
to black-stripe transitions, the width of the slit J, the
projector-mask distance $f_1$, and the range of the object R.

Referring to similar triangles ABC and A'B'C' we have:

$$A'B' = \left(\frac{R}{f_1} - 1\right) J$$

also in triangles ABD and A"B"D we can write:

$$A''B'' = \left(\frac{R}{f_1} - 1\right) J$$

Therefore:

$$A'B' = A''B'' = \text{gray area of an alternation of stripes} = \left(\frac{R}{f_1} - 1\right) J$$

Hence, for a mask containing n alternations, we have:

$$\text{Total false-black area} = n\left(\frac{R}{f_1} - 1\right) J$$

The ratio of false-detected area to correctly-detected area due
to the false-black problem is therefore:

$$\frac{\text{total false-black area}}{\cdots}$$

where M is the mask width.
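
The false-black width follows from similar triangles: a slit of width J at distance $f_1$ behind a stripe edge casts a partially lit band of width $(R/f_1 - 1)J$ at range R. A small sketch with hypothetical function names and illustrative numbers shows how quickly this grows with range; the ratio itself is not computed here.

```python
def false_black_width(R, f1, J):
    """Width of the gray (false-black) band cast by one stripe transition.

    R  -- range of the object (measured from the slit)
    f1 -- projector-mask distance
    J  -- width of the slit
    """
    return (R / f1 - 1.0) * J


def total_false_black(n, R, f1, J):
    """Total false-black width for a mask with n stripe transitions."""
    return n * false_black_width(R, f1, J)


# Illustrative numbers only: a 0.5-unit slit, mask 10 units from the slit.
one_edge = false_black_width(R=100.0, f1=10.0, J=0.5)
whole_mask = total_false_black(n=8, R=100.0, f1=10.0, J=0.5)
```

Because the total scales with n, the finest masks (largest n) are degraded first, which matches the observation above about masks 6, 7 and 8.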

B.5  Hardware-Algorithm  Modification 

The  DAD  box  compares  the  live  video  input  with  the  output  of 
the  reference  memory  (on  a  pixel-by-pixel  basis);  if  the  input 
pixel  is  less  than  or  equal  to  the  reference  pixel,  a  zero  is 
stored.  If  it  is  greater,  a  one  is  stored. 

However, considering 'white' the highest gray level and
'black' the lowest gray level, physically it is not possible to
have live inputs with gray level higher than reference memory.
Therefore, we had to invert the contents of reference memory.
This could be done using TTL inverters (Table 1). However, it was
easier to invert any input to the DAD box.

Experiments with the modified system yielded results that
were incorrect and unpredictable. Further investigation showed
that the Gray-code to binary-code converter was causing this
randomness and unpredictability, because inversion of the reference
memory in turn inverts the 8-bit code held in the 8 bit-plane
memories (i.e. instead of 1101110 the code 0010001 will be
decoded). This resulted in the need for another inversion right
before the bit-plane inputs (Figure 16). The following state tables
illustrate the above reasoning:


1st video input (columns: Object, Ref., Input, Bit-plane B1)

2nd video input:

Object   Ref.   Input   B2
A        0      1       1
B        0      0       0
C        0      0       0
D        0      1       1

Object   Binary   Equiv. Gray   B1B2
A        00       00            11
B        01       01            10
C        10       11            00
D        11       10            01

As we can see, B1B2 is exactly the inverse of the true
Gray code.


Fig. 16

CONSTRUCTION OF CORRECT CODES USING INVERSION


Ref.   Input   Bitplane          Ref.   Input   Bitplane
0      0       0                 1      0       0
0      1       not possible      1      1       not allowed
1      0       0                 0      0       0
1      1       0                 0      1       1

we have no state '1'             now we have permissible state '1'

Table 1
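
The effect of the inversion on decoding can be reproduced with the 2-bit codes from the state tables. The sketch below uses the standard reflected Gray-code to binary conversion; the function names are hypothetical and this is not a description of the converter circuit in the DAD box.

```python
def gray_to_binary(gray, bits=8):
    """Convert a reflected Gray code to its binary value."""
    binary = gray
    shift = 1
    while shift < bits:
        binary ^= binary >> shift
        shift <<= 1
    return binary


def invert(code, bits=8):
    """Bitwise inversion restricted to the given word length."""
    return code ^ ((1 << bits) - 1)


true_gray = [0b00, 0b01, 0b11, 0b10]               # objects A, B, C, D
stored = [invert(g, bits=2) for g in true_gray]    # B1B2 column: 11, 10, 00, 01

# Decoding the stored (inverted) codes directly scrambles the order ...
wrong = [gray_to_binary(s, bits=2) for s in stored]
# ... while a second inversion ahead of the converter restores it.
right = [gray_to_binary(invert(s, bits=2), bits=2) for s in stored]
```

Gray-to-binary conversion does not commute with bitwise inversion, which is consistent with the need for a second inversion before the bit-plane inputs.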


B.6  Look-up  Tables'  Restriction  of  Resolution  and  Dynamic  Range 

The mapping RAMs (look-up tables, L.U.T.s) are two
separate memories of 256 eight-bit words for storage of the values
of $\cot\alpha$ and $\cot\beta$. Values from these two L.U.T.s are added
using an eight-bit A.L.U. Therefore the following considerations
apply:

a)  Since the mapping RAMs are eight-bit, we are allowed to
have values just from 0 to 255. The largest dynamic
range will be achieved when

$$\frac{a}{f_1} = \frac{b}{f_2}$$

where
a is the finest mask division
b is the pixel width
$f_1$ = projector-mask distance, and
$f_2$ = camera focal length

b)  Since the A.L.U. output is eight bits we cannot use the
full dynamic range of the L.U.T.s. In order to avoid
overflow we have to use half of the dynamic range (0-127) of
the L.U.T. This in turn implies the following
resolution problem: since the addressing of these memories is
by autoincrement and the readout of the L.U.T. proceeds from
the 0th to the 255th location, and the corresponding addition
in the A.L.U. also takes place under the same criteria, we
have to load all 256 locations of the look-up tables
with only 128 distinct values (0-127). Therefore, every two
consecutive locations were loaded with the same value. It is
shown in Fig. 8 how the resolution is affected.
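
The loading scheme in (b) can be sketched as follows. This is an illustration only: it assumes the tables hold integer multiples of the unit cotangent (the case $a/f_1 = b/f_2$), and the function name is hypothetical.

```python
def load_lut(size=256, max_value=127):
    """Fill a look-up table so that the sum of two entries fits in 8 bits.

    Only max_value + 1 distinct values are available for `size` locations,
    so every two consecutive locations receive the same value.
    """
    step = size // (max_value + 1)      # 2 for the default arguments
    return [address // step for address in range(size)]


lut_alpha = load_lut()
lut_beta = load_lut()

# Largest possible A.L.U. result: 127 + 127 = 254, no 8-bit overflow.
largest_sum = max(lut_alpha) + max(lut_beta)

# Loss of resolution: four neighbouring (n, m) pairs give the same sum.
n, m = 10, 20
sums = {lut_alpha[i] + lut_beta[k] for i in (n, n + 1) for k in (m, m + 1)}
```

With even n and m the set `sums` holds a single value, so the four corresponding points are assigned the same range.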


Now, $\cot\beta_{unit} = \frac{b}{f_2}$, and since $\frac{a}{f_1} = \frac{b}{f_2}$,

$$\cot\beta_{unit} = \cot\alpha_{unit}$$

where $\alpha_{unit}$ is the illumination angle, with respect to the
reference plane, as if the object (or a point on the object) were
illuminated through the first stripe of the mask, and $\beta_{unit}$ is
the imaging angle, with respect to the reference plane, as if the
image of the object were received by the pixel next to the center
of the CID array.

Now,

$$\cot\alpha_{unit} = \frac{a}{f_1}$$

$$r_A = n\cot\alpha_{unit} + (m+1)\cot\beta_{unit} = (n+m+1)\cot\alpha_{unit}$$

$$r_B = (n+1)\cot\alpha_{unit} + (m+1)\cot\beta_{unit} = (n+m+2)\cot\alpha_{unit}$$

$$r_C = (n+1)\cot\alpha_{unit} + m\cot\beta_{unit} = (n+m+1)\cot\alpha_{unit}$$

$$r_D = n\cot\alpha_{unit} + m\cot\beta_{unit} = (n+m)\cot\alpha_{unit}$$

According to the preceding explanation, the contents of locations
m and m+1, and of n and n+1, in the look-up tables are
respectively equal; therefore, the four points A, B, C, and D will be
determined as being at the same range from the reference plane.

The resolution is affected as a function of range as
follows (Figure 18 is just part of Figure 17). As we can see, the
resolution depends on both R and the lateral location of the
object.

$$h_A = R\cot\alpha_A, \qquad \frac{h_A}{R} = \frac{n \cdot a}{f_1}$$

$$\frac{h_D}{R + \Delta R} = \frac{n \cdot a}{f_1}$$

and, since the right sides are equal,

$$\frac{R}{R + \Delta R} = \frac{h_A}{h_D}$$

The other way to approach the resolution problem is to
observe that since

$$R = \frac{d}{(n+m)\cot\alpha_{unit}}$$

we have

$$\Delta R = \pm d\left(\frac{1}{n+m+1} - \frac{1}{n+m}\right)\tan\alpha_{unit} = \pm d\,\frac{n+m-n-m-1}{(n+m+1)(n+m)}\,\tan\alpha_{unit}$$

$$|\Delta R| = \frac{d\,\tan\alpha_{unit}}{(n+m+1)(n+m)}$$

in which $\frac{1}{(n+m+1)(n+m)}$ is a function of $h_D$, $h_A$ and R.
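
The dependence of the range step on the code sum can be checked numerically. The sketch below assumes the triangulation relation $R = d / (k\cot\alpha_{unit})$ with $k = n + m$; the function names and numbers are hypothetical.

```python
def range_for_code_sum(k, d, cot_alpha_unit):
    """Range for a combined code sum k = n + m."""
    return d / (k * cot_alpha_unit)


def range_step(k, d, cot_alpha_unit):
    """Magnitude of the range change between code sums k and k + 1."""
    return d / (cot_alpha_unit * k * (k + 1))


# Illustrative numbers only.
d, cot_unit = 100.0, 0.01
near_step = range_step(100, d, cot_unit)   # large code sum: short range, fine step
far_step = range_step(10, d, cot_unit)     # small code sum: long range, coarse step
```

The step grows roughly as the square of the range, so distant points are resolved much more coarsely than near ones.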


B.7  Dark  Objects'  Range 

One problem implicit in the method of implementing the
algorithm is concerned with dark objects (including the whole
object or some dark spots on the object with bright background)
or holes in the object. The problem arises from the fact that
whether it is a dark object or a hole, and whether the mask is
covering this area or not, the result on the CID will be the same
and therefore a zero will be stored in the bit-planes.
Therefore the correct code will not be encoded for that area but will
be stored as 0000000 = 0.

B.8 Mask Adjustment

Another matter that must be dealt with carefully is mask
adjustment. The masks should not only be aligned with respect to
each other but they must also be aligned with respect to the
rotating drum on which they are mounted (i.e., parallel to the
plane of rotation).

Adjusting the masks could be elaborate (such as using a
computer and associating the video inputs with different masks).
One simple procedure, however, is to use an overhead projector,
project the masks on the board and adjust them by comparing the
base lines, etc. (This is the method we used.) The problem that
might occur with this approach is that if the first mask is not
parallel to the rotation plane, then all the other masks will be
tilted. To avoid this problem we must first level the projector
parallel to the floor and then use the floor as a reference line.

III. Topics investigated

A. Hardware

The design of the prototype structured-light source used a
conventional high-fidelity turntable that had been modified by
the erection of a metal wall about the circumference of (but not
attached to) the platter. A hole (of the size and shape of one
mask) was cut into the wall at the front of the device, and a top
fabricated to confine the light so that it emerged only from the
hole in front. A second wall was built on, and attached to the
circumference of, the rotating platter of the turntable. A set
of nine holes was cut into that wall for the insertion of the
glass masks (eight bit-plane masks and one reference-image mask)
described above. At the center of the interior of this circular
section was the light source (the photographic flash unit),
suspended from a bar attached to the top of the outer wall, and
hence stationary.

The original goal was the real-time (30 frames per second)
production and processing of range images; a 30
rotation-per-second rate of the turntable would have therefore been
required. For reasons described below, processing at that rate
was not possible, and the masks were rotated past the front
opening by hand, at a very low speed.


The  DAD  box  (so  called  because  of  its  manufacturer,  Digital- 
Analog  Designs,  Inc.)  was  designed  to  provide  rapid  storage  of 
the  eight  bit-plane  images,  and  their  aggregation  for  definition 
of  the  eight-bit  word  needed  to  describe  each  pixel.  The  box 
(see  technical  description  in  Appendix  C)  also  contained  a  fast 
arithmetic  logic  unit  and  the  look-up  tables  described  above. 
These  latter  two  elements  were  intended  to  create  range  images 
rapidly,  using  the  techniques  described  above.  The  DAD  box  was 
controlled  by  a  Franklin  1000  microcomputer,  for  which  software 
was written (see description and listings in Appendix D) to
trigger  and  synchronize  the  light  source.  Although  the 
software  could,  in  principle,  operate  at  a  speed  that  would 
permit  the  30-image-per-second  rate,  problems  with  the  turntable 
speed  control  kept  it  turning  too  rapidly  for  the  system. 
Accordingly,  efforts  were  aimed  at  providing  proof-of-principle 
rather  than  achieving  maximum  speed.  The  turntable  was  therefore 
rotated  by  hand,  pausing  at  each  of  the  nine  masks  to  illuminate 
the  object  and  record  the  result  with  the  CID  camera. 
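
The aggregation of the eight bit-plane images into one eight-bit word per pixel, followed by a table look-up, can be sketched in software as follows. This is only an illustration of the idea, not the DAD box logic: the bit order (plane 0 as the least-significant bit), the function names, and the contents of the look-up table are assumptions.

```python
def pack_bit_planes(planes):
    """Combine eight binary images into one 8-bit code per pixel.

    planes[b][row][col] is 0 or 1. Plane b is ASSUMED to supply bit b
    of the code (plane 0 = least-significant bit).
    """
    if len(planes) != 8:
        raise ValueError("expected exactly eight bit planes")
    rows, cols = len(planes[0]), len(planes[0][0])
    codes = [[0] * cols for _ in range(rows)]
    for bit, plane in enumerate(planes):
        for r in range(rows):
            for c in range(cols):
                codes[r][c] |= (plane[r][c] & 1) << bit
    return codes


def range_from_codes(codes, lookup):
    """Translate each 8-bit code into a range value via a 256-entry table."""
    return [[lookup[code] for code in row] for row in codes]
```

For a one-row, two-pixel image in which the first pixel is lit in planes 0 and 2 and the second in planes 2 and 7, `pack_bit_planes` gives the codes 5 and 132.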


B.  Software  -  Range  Data  Recognition  and  Object  Matching 
B.1  Introduction

This section describes a simulation system for range data
recognition and object matching. It is, in effect, a range
image processing algorithm that uses data of the kind obtained by
our computer vision system.


The simulation system began by generating the range data,
which are the distances from the sensor plane to the surfaces of
a convex, plane-faced 3-D object (in a specific orientation).
The range image processing program is composed of a range data
recognition algorithm and a single-view-to-3-D object matching
algorithm.

The goal of the range data recognition algorithm is to find

(i) the number of planes in the range image

(ii) the plane equations

(iii) the node coordinates of the object (visible in the
single view).

The matching algorithm is used to determine whether this single
view is one of the views of the reference object.

Because  the  range  data  image  is  a  kind  of  intensity  image, 
some  of  the  mature  techniques  in  intensity  image  processing  were 
applied  to  our  range  image  processing  algorithm,  including  image 
segmentation,  histogram  smoothing  and  chain-code  boundary 
representation.  The  simulation  system  achieved  the  desired 
results,  but  it  is  useful  only  for  convex,  plane-faced  objects. 

B.2  Simulation  System 


Fig.  1  shows  the  simulation  system  we  used.  Plate  1  is  a 


Fig. 1  SUMMARY OF THE SIMULATION PROCEDURE
(input: reference object with known node coordinates)

Plate 2  A simulated range image, mapped as gray levels. Darker
pixels represent smaller distances.


As  an  example,  we  chose  a  prism  sitting  on  the  X-Z  plane  as 
our  reference  object;  it  had  six  nodes  and  five  planes.  As  a 
convex,  plane-faced  object,  it  may  be  represented  by  just  the 
node coordinates, which will be used in later object matching
procedures.

After a simple geometric transformation of the reference
object, the rotated reference object, with known node coordinates
and plane equations, is used to generate the range data. The X-Y
plane is the sensor plane, and the z values are the range values,
which are represented in our range data image by gray
levels. Plate 2 shows the intensity range image displayed on the
RAMTEK 9400. The darker the pixel in the image, the smaller the
distance.

In order to examine the results of the range data
recognition algorithm, we store the 3-D node coordinates and the
parameters of the plane equations (i.e. the A, B, C, D values in the
plane equation AX+BY+CZ=D, with D equal to either 1 or 0).

The  artificial  range  data  are  then  sent  to  the  range  data 
recognition  system.  The  outputs  of  the  system  are  the  number  of 
the  planes  (in  this  specific  view),  the  plane  equations  and  the 
node  coordinates  (visible  in  the  view).  One  can  compare  the 
outputs  with  the  original  data  which  are  used  to  generate  the 
range  data  image  to  estimate  the  performances  of  the  range  data 
recognition  system. 

Fig. 2  FLOWCHART OF PLANE-DESCRIPTION PROCEDURE
(Steps shown: range data IOBJ(x,y); dz/dx = IOBJ(x+1,y) - IOBJ(x,y)
and dz/dy = IOBJ(x,y+1) - IOBJ(x,y); multiplication of each slope
by k; smoothing; then, for each plane i: plane equation finding
AiX+BiY+CiZ=Di, individual plane image finding, and node
coordinate finding; stop.)

The object matching system receives the information from both
the reference object and the single view. Comparing the node
coordinates and plane equations, one can easily make the decision:
matching or mismatching.

B.3  Range  Data  Recognition 

The  range  data  recognition  algorithm  starts  with  the  range  data 
image  IOBJ(x,y),  and  contains  the  following  three  steps: 

(i) plane segmentation

(ii) plane equation finding

(iii) node coordinate finding

Fig.  2  is  the  flowchart  of  the  range  data  recognition  system. 

We separate the planes in our single-view image based on the
different orientations of the individual planes. To find the
orientation of an individual plane, it is necessary to compute
the slopes in both the X and Y directions (i.e. the first partial
derivatives dz/dx and dz/dy). Plates 3(a) and (b) show the slopes
dz/dx and dz/dy, respectively. An area which is brighter than the
background has a positive value of the first partial derivative,
and a darker area has a negative value. Since every plane has its
own values of dz/dx and dz/dy, it is possible to separate the
planes in the 2-D slope histogram (Fig. 3) by simple peak detection.
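
The slope computation and the 2-D slope histogram can be sketched as below, using the forward differences given in the flowchart of Fig. 2. The function names and the nested-list image layout (`iobj[y][x]`) are assumptions made for illustration, and taking the most populated bins stands in for the "simple peak detection" mentioned above.

```python
from collections import Counter


def slopes(iobj):
    """Forward-difference slopes of a range image; iobj[y][x] is the range."""
    h, w = len(iobj), len(iobj[0])
    # dz/dx = IOBJ(x+1, y) - IOBJ(x, y); the last row and column are dropped
    dzdx = [[iobj[y][x + 1] - iobj[y][x] for x in range(w - 1)]
            for y in range(h - 1)]
    # dz/dy = IOBJ(x, y+1) - IOBJ(x, y)
    dzdy = [[iobj[y + 1][x] - iobj[y][x] for x in range(w - 1)]
            for y in range(h - 1)]
    return dzdx, dzdy


def slope_histogram(dzdx, dzdy):
    """Count how many pixels fall into each (dz/dx, dz/dy) bin."""
    hist = Counter()
    for row_x, row_y in zip(dzdx, dzdy):
        for sx, sy in zip(row_x, row_y):
            hist[(sx, sy)] += 1
    return hist


# Two planes side by side: z = 2x + y (x < 4) and z = 40 - 3x + 2y (x >= 4)
image = [[2 * x + y if x < 4 else 40 - 3 * x + 2 * y for x in range(8)]
         for y in range(4)]
hist = slope_histogram(*slopes(image))
peaks = [bin_ for bin_, _ in hist.most_common(2)]
```

Each plane contributes one heavily populated bin, while the pixels straddling the edge between the planes fall into isolated bins with small counts.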

Plate 3  Images of slope mapped as intensity: (a) dz/dx  (b) dz/dy



An inherent problem in any digitized intensity image is the
presence of discontinuities in the first derivative because of
the limited number of gray levels. This problem causes peak
spreading and even overlap in our 2-D slope histogram. For
instance, for very shallow planes (nearly parallel to the X or Y
axis) we might find slope values of 0 and 1 in one plane and 0 and
-1 in another plane (see Fig. 4(a)), so that there would be an
overlap in the slope histogram (Fig. 4(b)) and therefore no way to
separate these two planes.


To solve the problem, we multiply the slope value by a
factor k (say k=5) to increase the number of gray levels
(Fig. 5(a)). After the smoothing procedure (Fig. 5(b)), one can
easily find the separated peaks in the slope histogram (Fig.
5(c)). It is then no problem to separate these two planes.
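
The effect of the factor k can be illustrated with a one-dimensional cross-section of two shallow planes (slopes +0.3 and -0.3) whose range values are quantized to integers. This is a minimal sketch under assumed details: a moving average of five samples is used as the smoothing step, and the result is rounded back to integer gray levels; the report does not specify either.

```python
def roof_profile(n=100):
    """Integer-quantized cross-section: slope +0.3, then slope -0.3."""
    half = n // 2
    return [(3 * x) // 10 if x < half else (3 * (n - 1 - x)) // 10
            for x in range(n)]


def smoothed_slopes(z, k, window=5):
    """Scale the forward differences by k, average them, and re-quantize."""
    d = [k * (z[i + 1] - z[i]) for i in range(len(z) - 1)]
    return [round(sum(d[i:i + window]) / window)
            for i in range(len(d) - window + 1)]


z = roof_profile()
plain = smoothed_slopes(z, k=1)    # no gain before smoothing
scaled = smoothed_slopes(z, k=5)   # k = 5, as suggested in the text
```

Without the gain, the smoothed slopes of both planes round to 0 and fall into the same histogram bin. With k = 5 the rising plane yields values of 1 or 2 and the falling plane values of -1 or -2, so the two peaks no longer overlap.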


Each strong peak in the 2-D slope histogram implies the
presence of a distinct plane in our range data image. Taking the
peak as the center and considering the 8 neighbors of the peak
(see Fig. 6), we can label the pixels which have their dz/dx
values between (dz/dx)min and (dz/dx)max and their dz/dy values
between (dz/dy)min and (dz/dy)max as having the same value (see
Plate 4).
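
The labeling step can be sketched as follows, assuming that "the 8 neighbors of the peak" means the 3 x 3 block of histogram bins centered on the peak, so that the min and max limits are the peak value minus and plus one bin. The function name and the use of 0 for unlabeled pixels are illustrative choices.

```python
def label_plane(dzdx, dzdy, peak, label, labels=None):
    """Give `label` to pixels whose slopes lie within one bin of `peak`."""
    px, py = peak
    h, w = len(dzdx), len(dzdx[0])
    if labels is None:
        labels = [[0] * w for _ in range(h)]   # 0 means "not labeled"
    for y in range(h):
        for x in range(w):
            if abs(dzdx[y][x] - px) <= 1 and abs(dzdy[y][x] - py) <= 1:
                labels[y][x] = label
    return labels


dzdx = [[10, 11, 3, -15]]
dzdy = [[5, 4, 5, 10]]
labels = label_plane(dzdx, dzdy, peak=(10, 5), label=1)
labels = label_plane(dzdx, dzdy, peak=(-15, 10), label=2, labels=labels)
```

Pixels whose slopes match neither peak (the third pixel here) stay unlabeled, which is how blurred boundary pixels drop out of the labeled planes.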


Because of the slope smoothing procedure the boundaries of
the planes were blurred. The pixels around the boundaries were
reasonably rejected during the plane labeling. The planes
appearing in Plate 4 are smaller than those they represent, but
this margin makes for more-confident estimates of the planes'
equations.


Fig. 4  EFFECT OF LIMITED DYNAMIC RANGE
(a) Cross-section of two planes meeting at shallow angle
(b) The corresponding histogram, and its poorly-resolved peaks

Fig. 5  ENHANCEMENT OF THE EXAMPLE OF FIG. 4
(a) multiplication by a constant
(b) smoothing  (c) improved histogram

Picking 3 points on a labeled plane (choosing them to be
as far apart as possible), one can find the plane equation in the
form AX+BY+CZ=D (D=1 or 0).
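
A plane through three points can be computed from the cross product of two edge vectors. The sketch below scales the result so that D = 1 whenever the plane does not pass through the origin, and returns D = 0 otherwise; in the D = 0 case the normal is left unscaled, which is a simplification of this sketch rather than something stated in the report.

```python
def plane_through(p1, p2, p3, eps=1e-9):
    """Return (A, B, C, D) with A*x + B*y + C*z = D and D equal to 1 or 0."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector = u x v
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    if abs(a) + abs(b) + abs(c) < eps:
        raise ValueError("the three points are collinear")
    d = a * p1[0] + b * p1[1] + c * p1[2]
    if abs(d) < eps:
        return (a, b, c, 0.0)              # plane passes through the origin
    return (a / d, b / d, c / d, 1.0)      # normalize so that D = 1


flat = plane_through((0, 0, 2), (1, 0, 2), (0, 1, 2))       # the plane z = 2
tilted = plane_through((0, 0, 0), (1, 0, 1), (0, 1, 0))     # the plane z = x
```

Choosing the three points far apart, as the text recommends, keeps the edge vectors long relative to the quantization error in z.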

Once  we  learn  the  plane  equation,  we  can  readily  solve 
several  problems: 

(i) finding the relationship between the planes from their
plane equation parameters Ai, Bi, Ci, Di.

(ii) finding the individual plane images (i.e. the pixels
whose IOBJ(x,y) value satisfies the plane equation). Plates
5(a),(b),(c) show the individual planes which were finally
found. They are exactly the original planes.

(iii) finding the nodes of the object: searching the boundary
pixels in the individual plane image and using the chain code as
the indicator of direction, one can find the turning points in
the boundaries (i.e. the nodes of the object) from the changes in
the chain code.
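
Step (iii) can be sketched for an ordered, closed, 8-connected boundary. The direction numbering below (0 = +x, counted counterclockwise in steps of 45 degrees) is one common chain-code convention and is an assumption here, as are the function names. On slanted edges of a digitized boundary the code alternates between two values, so a practical version would need to smooth the code first; the sketch ignores that.

```python
# Chain-code value for each step (dx, dy) between neighboring pixels
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}


def chain_code(boundary):
    """Chain code of a closed boundary given as an ordered list of (x, y)."""
    n = len(boundary)
    codes = []
    for i in range(n):
        (x0, y0), (x1, y1) = boundary[i], boundary[(i + 1) % n]
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes


def turning_points(boundary):
    """Pixels at which the outgoing code differs from the incoming code."""
    codes = chain_code(boundary)
    return [boundary[i] for i in range(len(boundary))
            if codes[i] != codes[i - 1]]


rectangle = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2),
             (2, 2), (1, 2), (0, 2), (0, 1)]
corners = turning_points(rectangle)
```

For the axis-aligned rectangle above, the turning points are its four corners.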

The outputs of our range image recognition program (the
number of planes, the plane equations, and the visible-node
coordinates) offer enough information for the object matching
procedure.

B.4  Object  Matching 

The object matching program answers the question: is
this single view a specific view of our reference object? The
matching procedure includes the following three steps (see Fig.
7):


(i) bottom-node matching

(ii) plane-equation matching

(iii) all-node matching

Fig. 7  FLOWCHART OF THE OBJECT-MATCHING PROCEDURE

The  node  coordinates  and  plane  equations  of  the  single  view 
are  found  from  the  range  image  recognition  program,  and  those  of 
the reference object are assumed known a priori.

It  was  assumed  that  the  object  we  are  observing  has  its 
bottom  plane  parallel  to  the  X-Z  plane.  Therefore,  one  can 
choose  one  of  the  planes  of  our  reference  object  as  the  bottom 
plane  and  let  it  be  parallel  to  X-Z  plane  (one  of  the  possible 
positions)  when  we  start  the  matching  procedure. 

A parallel transformation (translation) is needed for both the
reference object and the object being observed in order to compare
their node coordinates and plane equations. Putting a bottom node at
the origin (see Fig. 8), we can first compare the coordinates of
the bottom nodes. This means computing the total error in the 3-D
coordinates


\[ e_{b.n.} = \sum_{i=1}^{N} \left[ (x_{ri} - x_{oi})^2 + (y_{ri} - y_{oi})^2 + (z_{ri} - z_{oi})^2 \right]^{1/2} \]

where N is the number of nodes;

x_{ri}, y_{ri}, z_{ri} are the node coordinates of the reference object;

x_{oi}, y_{oi}, z_{oi} are the node coordinates of the observed object.
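
The bottom-node comparison can be sketched as a translation that puts a chosen node at the origin, followed by the sum of Euclidean distances between corresponding nodes. The function names are illustrative, and the sketch assumes the two node lists are already in corresponding order; the rotation about the Y axis discussed below is not included.

```python
import math


def translate_to_origin(nodes, anchor_index=0):
    """Shift all nodes so that nodes[anchor_index] lies at the origin."""
    ax, ay, az = nodes[anchor_index]
    return [(x - ax, y - ay, z - az) for x, y, z in nodes]


def total_node_error(ref_nodes, obs_nodes):
    """Sum over corresponding nodes of the 3-D Euclidean distance."""
    if len(ref_nodes) != len(obs_nodes):
        raise ValueError("node lists must have the same length")
    return sum(math.dist(r, o) for r, o in zip(ref_nodes, obs_nodes))


reference = translate_to_origin([(1, 2, 3), (4, 2, 3)])
observed = [(0, 0, 0), (0, 4, 0)]
error = total_node_error(reference, observed)
```

A match would then be declared when this error falls below some threshold; the report does not state the threshold used.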


Usually a rotation of the reference object about the Y axis has
to be done before we learn whether the bottom nodes match. If they
do not, we put another bottom node of our reference object (in the
same position) at the origin and compare it to the observed object
again. If there is still no match after we have tried all bottom
nodes in this position at the origin, we change the position of the
reference object and go through the same bottom-node matching
procedure again. If there is a match, we proceed with the
plane-equation matching program. One can simply compute the
difference between the parameters of the plane equations


e_{p.e.}  [equation missing in the source]

where m is the number of planes, and

A_{ri}, B_{ri}, C_{ri}, D_{ri} are the parameters of the plane equations of the
reference object;

A_{oi}, B_{oi}, C_{oi}, D_{oi} are the parameters of the plane equations of the
observed object.

Finally, we invoke the all-node matching program if the bottom-node
matching and the plane-equation matching have passed. The goal
of the all-node matching program is to check:

(i) whether all nodes we found from the range image can find their
partners in the reference object;

(ii) whether all extra nodes in the reference object are behind the
surface of our specific view.
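
Check (i) can be sketched as a nearest-partner test with a distance tolerance. The tolerance value and the function name are assumptions, and check (ii), which needs the visible surface of the view, is not covered by this sketch.

```python
import math


def unmatched_nodes(obs_nodes, ref_nodes, tol):
    """Observed nodes that have no reference node within distance tol."""
    return [o for o in obs_nodes
            if all(math.dist(o, r) > tol for r in ref_nodes)]


reference = [(0, 0, 0.05), (1, 0, 0), (5, 5, 5)]
good_view = [(0, 0, 0), (1, 0, 0)]
bad_view = [(0, 0, 0), (2, 2, 2)]
```

Check (i) passes when the returned list is empty, as it is for `good_view` with a tolerance of 0.1; `bad_view` fails because (2, 2, 2) has no partner.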


B.5  Results  and  Conclusions 


To examine the simulation system, we compared the original node
coordinates and plane equations which we used to generate the range
image with those we finally found using the range image recognition
system. The results are presented in Table I and Table II,
respectively. The total error for all 18 node coordinates was 10.87
and the error between the parameters of the plane equations was
0.01268. These results show the good performance of our recognition
system. In addition, the matching procedure has the advantage of
requiring relatively little memory and computation, because
the whole matching procedure does not require image-sized data
operations. (Appendix E contains listings of the software used.)

IV.  Conclusions  and  implications  for  the  Navy 

The  results  presented  here  indicate  that  structured  light  is  a 
feasible  technique  for  real-time  ranging.  Equally,  regardless  of  how 
a  range  image  is  acquired,  useful  descriptions  of  objects  can  be 
extracted  from  such  an  image.  The  DAD  box  appears  to  be  a  fast 
processor  that  can  use  look-up  tables  to  compute  range  in  real-time, 
and  should  be  considered  for  integration  into  structured-light 
systems. 

Clearly, further work must be done to examine the mechanical
aspects of sweeping the masks past the light source, and the intensity
of the light must be augmented. (An alternative to increasing the
intensity of the source is the use of a photomultiplier at the camera
input. Efforts were made in this direction using an NBS-supplied
low-light-level [LLL] camera. Irreconcilable optical incompatibilities
between the LLL camera and the CID camera, however, made it impossible
to make a realistic test of the benefits of such an approach.)

Possible  disadvantages  of  this  kind  of  system  include  the  need 
for  a  very  fine  slit  to  reduce  the  size  of  the  region  of  uncertainty 
cast  by  the  masks.  It  is  the  requirement  for  the  fine  slit  that  leads 
to  the  need  for  a  high-intensity  light  source  (as  noted  above).  The 
system,  as  it  uses  visible  light,  must  operate  in  a  darkened 
environment  and  hence  is  probably  not  suitable  where  human  activity  is 
also  involved. 

In  many  manufacturing-related  applications,  however,  structured 
light  is  likely  to  be  very  useful.  Opportunities  include  inspection 
tasks:  for  completeness  of  structure,  or  quality  of  surface  finish; 
and  assembly  tasks:  for  parts  matching  and  orientation.  Many  of  the 
precision and/or hazardous manufacturing and handling responsibilities
of  the  Navy  appear  to  be  reasonable  candidates  for  adoption  of  this 
technology,  which  requires  of  necessity  a  relatively  static  platform 
for  installation  of  the  equipment.  It  is  thus  far  better  suited  to 
the  manufacturing  and  quality  control  environments  than,  for  example, 
to  a  mobile-robot  application. 

Several  research  papers  which  bear  on  the  range  of  prospective 
applications of this kind of system appear in Appendix F.

591 /    .0621597547   .385869565   .114130435   .239995264
18341    .124319509    .771739131   .22826087    .479990528
7/5.12   .186479264    1.1576087    .342391304   .719985792
36682    .248639019    1.54347826   .456521739   .959981056
79585    .310798773    1.92934783   .570652174   1.19997632


The following plot shows the sensitivity of the range computed by
the system, at a specific camera tilt (c), object range (R),
camera-projector distance (d), and change in range (ΔR), versus the
change in object vertical location.

a) ΔR = .01

   d=1, R=1, c=90°:  solid black line
   d=1, R=1, c=75°:  red line
   R=1, c=90°:  dashed line

b) ΔR = .001

   same as above

c) ΔR = .0001

   same as above


The attached plot shows the changes detected in range (ΔR*) due
to a change made in camera tilt (c) for the following parameters:

a) ΔR = .01 (a change in range)

   camera-projector distance (d) = 2, range (R) = 1,
   height of the object (y) = .5:  red line

   camera-projector distance (d) = 1, range of the object (R) = 1,
   height of the object (y) = .5:  solid black line

   d=1, R=2, y=.5:  dashed line

b) ΔR = .001

   same as above

c) ΔR = .001

   same as above

Therefore the plot shows the sensitivity of the computed range
with respect to a change in range of the object versus the camera
tilt. Note that the resolution (pixel size) in the CID is 36 µm.

1.0  System Introduction

The DVDS Model 7 with Comparator/Slicer Option can be used for
adaptive thresholding of real-time video images. A full video
field can be stored in the reference memory to 8 bit greyscale
resolution. The stored image can then be compared with the
real-time video input and the resulting one bit image stored in the
other, bit plane, memory. Eight different fields may be
processed and stored in this way. The bit plane memory may be
displayed directly or through a 256 x 8 bit Mapping RAM which
implements an arbitrary programmable function. In addition, the
host computer can load or unload the reference memory and unload
the bit plane memory either directly or through the Mapping RAM.

1.1  Controller 

The controller card contains the master clock oscillator, PROMs
with all the necessary timing and sync signals, and the circuitry
needed for controlling the dynamic RAMs used in the image
memories.

1.2  Image  Memories 

There are two memory cards, the reference and the bit plane.
Each memory card uses 16K dynamic RAMs arranged 256 H by 240 V by
8 deep. They run in real time, synchronous with the video
display. Since the RAMs do not access fast enough, they are
multiplexed 4 deep. For a read cycle, four pixels are addressed in
parallel and are serially shifted out. The write cycle works
similarly but in the opposite direction.

1.2.1  Reference  Memory 

The reference memory stores an eight bit gray scale image. It
may be filled with data from the video input or directly written
into through the DMA interface from the host computer. It may
be displayed through the video output or unloaded through the DMA
interface to the computer.


1.2.2  Bit Plane Memory

The bit plane memory is configured as eight 256 x 240 x 1 bit
planes. Input data to any of the planes comes from a magnitude
comparator. This compares the live video input from the A/D with
the image stored in the reference memory on a pixel by pixel
basis. An offset may be specified which is added to the pixel
values from the reference memory. This provides noise immunity
when using the reference memory to remove differential scene
lighting. If the input pixel is less than or equal to the
reference pixel (plus offset), a zero is stored. If it is greater, a
one is stored. When the reference pixel value plus offset
exceeds 255 the value stored in the bit plane is automatically zero
since the input pixel must be less than 256. Only one plane may
be written into at a time. This memory may be displayed through
the video output or unloaded to the computer through the DMA
interface. The RAM look-up table may be placed between the bit
plane memory and either of these output channels (see 1.5).
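
The comparator rule described above can be modeled in a few lines. This is a software illustration of the stated behavior only; the function names are hypothetical and nothing here reflects the actual circuit.

```python
def compare_pixel(live, reference, offset=0):
    """Return the bit stored for one pixel: 1 if live > reference + offset."""
    threshold = reference + offset
    if threshold > 255:
        return 0          # an 8-bit input can never exceed this threshold
    return 1 if live > threshold else 0


def threshold_field(live_image, reference_image, offset=0):
    """Apply the comparator pixel by pixel to produce one bit plane."""
    return [[compare_pixel(lv, ref, offset)
             for lv, ref in zip(live_row, ref_row)]
            for live_row, ref_row in zip(live_image, reference_image)]


plane = threshold_field([[100, 95, 255]], [[90, 90, 250]], offset=5)
```

In the example row, only the first pixel exceeds its reference value plus the offset of 5, so only that bit is set.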

1.3  Address  Generator 

This card generates the addresses used to read from and write to
the memory in the video modes. It also generates the refresh
addresses for the dynamic RAMs. Its addressing modes are fixed.


1.4  Analog  Input/Output 

This card is the interface from the camera to the frame store and
from the frame store to the CRT monitor. The video input from
the camera is low-pass filtered, amplified, and digitized at 5
MHz to 6 bit gray scale resolution. The digitized video is then
put onto the system bus destined for the memory. For video
output there is a selector that allows the viewing of any of
three sources: the reference memory, the bit plane memory, and the
bit plane memory through the Mapping RAM. With any of these
sources either the entire gray scale or any of the eight
individual planes may be displayed. The translation from digital to
analog is done at 5 MHz in an 8 bit converter.

1.5  Computer Interface

The computer interface card is the link between the frame store
and the host computer. It uses an 8 bit bidirectional data bus
and an 8 bit address bus. This card allows the user to program
the frame store into its various modes. It also has its own
address generator for the image memories that allows
bidirectional Direct Memory Access (DMA). A 256 x 8 bit RAM look-up table
is included for arbitrary translation of bit plane memory data.
This can operate on data output to the DAC or through the DMA
channel.


1.6  Case  and  Electrical 


The frame store is packaged in an S-100 type computer case. The
motherboard has been modified by adding an A.C. termination on
each line. It uses standard S-100 wire wrap cards. There is an
internal power supply that uses 120 VAC +/- 10%. The on/off
switch has a key lock and is located, along with the power
indicator LED, on the front panel. Due to several internal reset
circuits, after turning the unit off, the user should wait 10
seconds before turning the unit back on. All video connections
are made with BNC connectors located on the back panel. The
computer interface is via a 37 pin subminiature D type connector
on the back panel. The fuseholder is also on the back panel.
The fuse is a 3 amp slo-blo type.


2.0  Computer  Interface 


2.1  Description 

The DVDS is designed to be a peripheral to a host computer. Its
modes of operation are programmable and once programmed it needs
no further intervention to operate. Programming is done by
loading registers in the DVDS that control specific functions.

Two 8 bit output ports must be available from the computer. One,
the command port, specifies which register is to be loaded. The
other, the output data port, carries the data to be loaded. See
paragraph 2.2 for the list of registers. These registers are
write only, that is, they cannot be read. Because the DVDS uses
the handshaking line on the command port to initiate register
loading, the data port must be loaded before the command port.
Since the registers may come up at random on power up, they
should be initialized to a known condition before the DVDS is
used. This is called booting the system. See the program
listing "BOOT". An output port from the DVDS is used to unload
the image memory to the host computer.


2.2  List of Registers, Addresses, and Modes

2.2.1  Computer Interface

Address

0    Bit 0 :  0 = register loading during vertical drive only
              1 = register loading during entire frame
     All other bits are "don't care"

1    Bit 0 :  0 = Reference Memory Live
              1 = Reference Memory Frozen
     Bit 1 :  0 = Bit Plane Memory Frozen
              1 = Bit Plane Memory Live
     Bit 2 :  0 = DMA Output Mode
              1 = DMA Input Mode
     Bit 3 :  0 = Normal (Video IN, OUT) Mode
              1 = DMA Mode
              Note: In the DMA mode both memories are
              automatically prevented from writing in
              data from the A/D and the reference memory
              is displayed regardless of any other
              controls.
     All other bits are "don't care"

2    DMA Horizontal Address Register (0-255)

3    DMA Vertical Address Register (0-255)

4    DMA enable. During a read this is a flag, that is, it
     initiates the read function; during write it contains
     input pixel data 0-255.

5    Bit 0 :  0 = Display Reference Memory
              1 = Display Bit Plane Memory
     Bit 1 :  0 = Direct Memory Display
              1 = Display Bit Plane Memory through Mapping RAM
     Bit 2 :  0 = Display full grey scale
              1 = Display Single Bit of selected Memory
              Note: The Single Bit display of either the
              Reference Memory, the Bit Plane Memory or the
              Bit Plane through the Mapping RAM may be chosen.
              The bit is chosen by Register 6.
     Bit 3 :  0 = A/D on Input Data Bus
              1 = Command Data Bus on Input Data Bus
              Note: This allows data from the Host Computer
              to be placed on the Bit Plane Memory input
              rather than the input video. This is useful
              for testing.
     All other bits are "don't care"

6    Bits 0-2 :  Writing into the Bit Plane Memory occurs one
                 plane at a time. These three bits determine
                 which plane (0-7) is active (live).
                 For the display of a single bit of one of the
                 output sources (Register 5 Bit 2 high) register
                 6 selects the bit to be displayed.
     All other bits are "don't care"

2.2.2  Mapping RAM

Address

224  Bit 0 :  0 = Normal Operation
              1 = Load Masking RAMs

225  write enable for Mapping RAM (0-255)

227  write address register (0-255)


2.2.3  Bit Plane Memory

Address

255  Bit Plane Memory Offset Register (0-255)

     Note: This adds a selectable bias to the
     data from the reference memory before
     being compared with data from the
     A/D for the bit plane memory. This
     is useful for removing the effects
     of noise.


2.2.4  Operating  Modes 


     Mode                                     Register 1   Register 5

 1.  Both Memories Frozen,                         1            0
     Reference Memory Displayed.

 2.  Reference Memory Live, Bit                    0            0
     Plane Memory Frozen, Reference
     Memory Displayed.

 3.  Both Memories Frozen,                         1            1
     Bit Plane Memory Displayed.

 4.  Bit Plane Memory Live, Reference              3            1
     Memory Frozen, Bit Plane Memory
     Displayed. Register 6 contains the
     Plane Number which is Live (0-7).

 5.  Both Memories Frozen,                         1            3
     Display Bit Plane Memory through
     Mapping Memory.

 6.  Bit Plane Memory Live,                        3            3
     Display Bit Plane Memory through
     Mapping RAM. Register 6
     contains the Plane Number which
     is Live (0-7).

 7.  Both Memories Frozen, Reference               9            0
     Memory Displayed, DMA Write into
     Reference Memory.

 8.  Both Memories Frozen, Reference              13            0
     Memory Displayed, DMA read from
     Reference Memory.

 9.  Both Memories Frozen, Bit Plane              13            1
     Memory Displayed, DMA read from
     Bit Plane Memory.

10.  Both Memories Frozen, Bit Plane,             13            3
     DMA read from Bit Plane Memory
     through the Mapping RAM.

Note: In any of these modes a single plane of the output
source may be viewed by adding 4 to the number in Register 5 and
loading Register 6 with the Plane number (0-7).
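
The Register 1 values in the mode table follow from the bit assignments listed in 2.2.1, which the following sketch decodes. The dictionary keys and function names are illustrative only; the bit positions are those given in the register list.

```python
def decode_register_1(value):
    """Break a Register 1 value into the four flags defined in 2.2.1."""
    return {
        "reference_frozen": bool(value & 0x01),   # bit 0
        "bit_plane_live": bool(value & 0x02),     # bit 1
        "dma_input": bool(value & 0x04),          # bit 2
        "dma_mode": bool(value & 0x08),           # bit 3
    }


def single_plane_display(register_5_value):
    """Register 5 value for viewing one plane: add 4 (set bit 2)."""
    return register_5_value | 0x04


dma_read = decode_register_1(13)
bit_plane_live = decode_register_1(3)
```

For example, the value 13 used for the DMA read modes sets bits 0, 2 and 3 (reference memory frozen, DMA input, DMA mode), while the value 3 used in modes 4 and 6 freezes the reference memory and makes the bit plane memory live.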


2.3  Direct Memory Access (DMA)
2.3.1  Introduction


The DVDS allows the host computer to directly access the
memories. Frozen images from either memory or from the bit plane
memory through the Mapping RAM may be unloaded to the computer
via the high speed DMA interface. Additionally, the reference
memory can be directly loaded from the computer. The transfer
rate in either direction can exceed 200K bytes/second. A DMA
address generator is used for both read and write operations.
Pixel coordinates may be loaded into the horizontal and vertical
address registers for random access of the image memory.
Additionally, the DMA address generator will autoincrement after
each read or write operation. This allows a starting address to
be specified and then the circuit will provide sequential
addresses itself. This speeds the data transfer rate
substantially. The top left pixel is at address 0 H, 0 V. The
top right pixel is at address 255 H, 0 V. The bottom left pixel
is at address 0 H, 240 V. The bottom right pixel is at location
255 H, 240 V. The addresses will autoincrement from the last
address loaded into the horizontal address register until 255 is
reached. Then the vertical address will increment by one and the
horizontal address overflows back to 0.
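
The autoincrement rule can be modeled as follows. This is a sketch of the addressing sequence as described, with hypothetical function names; what the hardware does when the vertical address passes the last row is not stated and is not modeled.

```python
def next_address(h, v):
    """Address that follows (h, v) after one read or write operation."""
    if h < 255:
        return h + 1, v
    return 0, v + 1       # horizontal overflows to 0, vertical steps by one


def sequential_addresses(h, v, count):
    """The first `count` addresses visited, starting at (h, v)."""
    visited = []
    for _ in range(count):
        visited.append((h, v))
        h, v = next_address(h, v)
    return visited


row_end = sequential_addresses(254, 0, 3)
```

Starting at 254 H, 0 V, three transfers touch (254, 0), (255, 0) and then (0, 1).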


2.3.2  DMA  Read 

The algorithm for performing DMA read is outlined below. See
also the program listings for REF.READ and MOVLOD.OBJ0

1.  load register 0 with 1 - allows DMA access during entire
    frame.

2.  load register 1 with 13 - this freezes the images and
    prepares for a DMA read operation.

3.  load register 5 with the proper code for the desired
    output source (0 = Reference Memory, 1 = Bit Plane
    Memory, 3 = Bit Plane Memory through the Mapping RAM).

4.  load register 2 (DMA horizontal address register) with
    H pixel coordinate.

5.  load register 3 (DMA vertical address register) with V
    pixel coordinate.

6.  load register 4 to assert the DMA read at the previously
    specified address. The pixel will be sent to the
    computer via the DVDS output port.

7.  for random access go to 4. for sequential access go to
    6. and repeat until the desired number of pixels are
    received.

8.  load register 1 with 1 - this returns the system to its
    normal state and keeps the images frozen.

Note: The handshaking lines should be checked after each
step.
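
The sequential form of the read procedure above can be written out as the ordered list of register loads the host would issue. The function name is hypothetical, the value written to register 4 (0 here) is an assumption since the register acts only as a flag during a read, and the handshaking checks and the actual port I/O are left out.

```python
def dma_read_commands(source_code, h, v, count):
    """(register, value) loads for a sequential DMA read of `count` pixels."""
    commands = [
        (0, 1),             # step 1: allow DMA access during entire frame
        (1, 13),            # step 2: freeze images, prepare for DMA read
        (5, source_code),   # step 3: 0, 1 or 3 selects the output source
        (2, h),             # step 4: horizontal start address
        (3, v),             # step 5: vertical start address
    ]
    # step 6, repeated: each load of register 4 triggers one pixel read;
    # the address autoincrements, so no new address is needed in between
    commands += [(4, 0)] * count
    commands.append((1, 1))  # step 8: back to normal state, images frozen
    return commands


commands = dma_read_commands(source_code=0, h=10, v=20, count=2)
```

Because each register load places the data on the data port before the command port is written, a driver built on this list would emit two port writes per entry, data first.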


2.3.3  DMA  Write  (Reference  Memory  Only) 

The procedure for performing DMA write is similar to the DMA
read. Also see the program listings REF.RAMP and MOVLOD.OBJ0

1.  load register 0 with 1 - this allows DMA access during
    the entire frame.

2.  load register 1 with 9 - this freezes the image and
    prepares the unit for write.

3.  load register 5 with 0 - this displays the reference
    memory.

4.  load register 2 with the pixel H coordinate.

5.  load register 3 with the pixel V coordinate.

6.  load register 4 with the pixel to be written into the
    image memory.

7.  for random access go to 4. for sequential access go to
    6. and repeat until the desired number of pixels are
    transferred.

8.  load register 1 with 1 - this returns the unit to
    normal operation with the images frozen.

Note: The handshaking line should be checked after each
step.


2.4  Interfacing  Requirements  for  the  Computer 
2.4.1  Computer  Output  Ports 

Two output ports must be provided by the computer, the command
port and the data port. Both must be 8 bits wide, active high at
normal TTL levels. All lines are buffered in the DVDS and present
1 LS TTL load. 560 ohm pull up resistors to +5 volts are
included on these lines in the DVDS. These ports are treated by
the unit as one 16 bit port. If separate ports are used it is
important to load the data port before the command port, which
has the handshaking lines associated with it. Once data has been
set up and latched on both ports the COMPUTER READY line should
go high. The DVDS will latch both ports 500 ns. after the rising
edge of this signal. COMPUTER READY should be a standard TTL
level signal. At this time the DEVICE BUSY line will go high to
indicate that the command is being executed. The value loaded into
register 0 determines when commands will be executed. In the
"V. Drive Only" mode (0), a command will be held until the next
vertical blanking interval before execution. This is useful for
freezing the memories at the end of a field. Other commands such
as DMA transfers should take place at high speed so register 0 is
put into the "Anytime" mode (1). Once the DVDS has executed the
command from the host computer, the DEVICE BUSY line (TTL) will
go low. Also at this time the BUSY strobe will pulse for 1
microsecond. These indicate that the computer may send another
command. Any commands sent before this time will be lost.
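
The handshake described above can be sketched from the host's side as follows. The `HostPorts` interface is hypothetical and stands in for whatever parallel I/O the host provides. The sketch assumes that `device_busy()` already reflects the latched command when it is first polled, and that dropping COMPUTER READY afterwards is acceptable; the text does not state either point.

```python
import time
from typing import Protocol


class HostPorts(Protocol):
    """Hypothetical host-side I/O routines (not part of the DVDS manual)."""
    def write_data(self, value: int) -> None: ...
    def write_command(self, register: int) -> None: ...
    def set_computer_ready(self, level: bool) -> None: ...
    def device_busy(self) -> bool: ...


def load_register(ports: HostPorts, register: int, value: int,
                  timeout_s: float = 0.1) -> None:
    """Load one DVDS register and wait until the command has executed."""
    ports.write_data(value & 0xFF)         # data port must be loaded first
    ports.write_command(register & 0xFF)   # then the command port
    ports.set_computer_ready(True)         # rising edge: DVDS latches both ports
    deadline = time.monotonic() + timeout_s
    while ports.device_busy():             # DEVICE BUSY low means "send the next one"
        if time.monotonic() > deadline:
            raise TimeoutError("DEVICE BUSY did not go low")
    ports.set_computer_ready(False)        # re-arm for the next rising edge
```

Waiting for DEVICE BUSY to fall before returning is what prevents a following command from being lost. In the "V. Drive Only" mode the wait can last until the next vertical blanking interval, so the timeout is chosen well above one field time.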


2.4.2  Computer  Input  Port 

One 8 bit input port to the computer is required to read data
from the image memory. This port must be standard level, TTL,
active high. After a DMA READ command has been executed, the
pixel datum will be placed on the computer input port. The
DEVICE READY line will then pulse for 1 microsecond to strobe
this information into the computer input port. This signal is at
standard TTL levels. Note that the data should be strobed in on
the trailing edge of the DEVICE READY line.


3.0 Analog Input/Output


3.1  Description 

This card accepts the video input from a camera, filters the
signal, digitizes it, and passes it to the image memory. It also
receives digital video from various sources, converts it to
analog video, and passes it through a 75 ohm driver to the CRT
monitor. In addition, composite sync is available for driving
the camera. BNC connectors for video in, video out, and composite
sync are located on the back of the DVDS.

3.2  Analog  to  Digital  section 

This section accepts the input video from the camera and
digitizes it.


input impedance          75 ohms
input signal level       1 volt sync tip to white
sample rate              5 MHz
gray level resolution    256 levels
number of bits           8
filter cutoff            2 MHz, 24 dB/octave


A white clip light is provided on the front panel. This will
come on when the video signal is too high, causing the analog to
digital converter to overflow.


3.3  Digital  to  Analog  section 

Either of the stored images or the arithmetic unit may be
selected for display. In addition to the full eight bit grey
scale display, a single bit plane of any of these sources may be
selected for viewing with the bit plane address register (see
2.2.1). The single bit mode output is either full black or full
white. Blanking and sync are added digitally so there is no
adjustment of sync or set up levels. The composite signal is
filtered to remove spurious high frequency components.


output impedance         75 ohms
output level             1 V p-p terminated
sample rate              5 MHz
number of bits           8
filter cutoff            2 MHz, 12 dB/octave


3.4  Composite  Sync. 


Composite sync is used to drive the video camera. The DVDS uses
the EIA RS-170 standard.


output impedance         75 ohms
level                    4 V p-p terminated
polarity                 negative going
horizontal rate          15,734 Hz
horizontal lines         525
active lines             480
vertical rate            60 Hz
frame rate               30/second


Note that because this is a 256 H by 240 V system, the same image
is repeated on both fields.

3.5 Camera requirements

The most important requirement is that the camera sync to the
system. A camera that will lock to the above RS-170 format will
work. To obtain maximum use of the 256 gray levels, it should
have a signal to noise ratio greater than 48 dB. The digitizing
rate is 5 MHz so the camera should have a bandwidth of at least
2.5 MHz.
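
The two figures quoted above are consistent with the converter parameters. The short calculation below shows one plausible reading: the 48 dB figure as the ratio of full scale to a single gray level, and the 2.5 MHz figure as half the sampling rate (the Nyquist limit). The manual does not spell out its reasoning, so this interpretation is an assumption.

```python
import math

levels = 256              # 8 bit gray scale
sample_rate_hz = 5e6      # digitizing rate

# Ratio of full scale to one gray level, expressed in dB.
snr_db = 20 * math.log10(levels)

# Highest signal frequency a 5 MHz sampler can represent.
nyquist_hz = sample_rate_hz / 2

print(f"{snr_db:.1f} dB, {nyquist_hz / 1e6:.1f} MHz")  # 48.2 dB, 2.5 MHz
```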


3.6 Adjustment procedure

1) Needed: small screw driver, dual trace oscilloscope,
Drawing # 001-018 A/D-D/A component layout.

2) Remove the cover from the machine.

3) Connect the camera.

4) Turn the power on.

5) Point the camera on a bright scene and open the f-stop
for maximum contrast.

A/D Adjustment

6) Set both scope channels for 1V/division.

7) Ground both probes and line the traces on the scope
screen up on the second line from the bottom.
This is 0V.

8) Referring to drawing # 001-018, connect Ch 1 to TP-1
and Ch 2 to TP-2. Unground them and sync to Ch 1.
Ch 2 should be between 6.5V and 7.0V.

9) Cap the camera lens. Turn the BLACK LEVEL ADJUST pot
so that the lowest portion of the active video ( not
sync ) just touches 0V.

10) Uncap the camera lens and turn the FULL SCALE ADJUST
pot so that the highest ( brightest ) portions of the
active video just touch the line on Ch-2. If the
video goes above the CH-2 reference, the white clip
light on the front panel should turn on.

11) The adjustments in steps 9 and 10 are somewhat
interactive so go back and repeat 9 and 10 as required.

D/A Adjustment

12) Either load a full scale ramp into the reference
memory or point the camera at a bright light source
to create a full scale signal for the D/A.

13) Set CH1 for .5V/division

14) Ground the probe and put the trace on the second line
up from the bottom.

15) Connect the probe to TP-4. Unground the scope and A.C.
couple it.

16) Turn the D/A REFERENCE ADJUST pot so that the video
signal is 1 Volt peak to peak (2 Volts peak to peak if
nothing is connected to the output, i.e. unterminated).

This completes the Adjustment procedure.


4.0  Mapping  RAM 


4.1  Description 

The 256 x 8 Mapping RAM can be loaded with an arbitrary transfer
function to process the image stored in the Bit Plane Memory.
The output of this RAM can be sent to the analog input/output
card for display or to the DMA interface for transmission to the
computer. Note that due to the pipeline delay of the circuitry,
the displayed image through the Mapping RAM is shifted to the
right by one pixel.

4.2 Loading the Mapping RAMs


The following is the general algorithm for entering a transfer
function. A dedicated address generator is included for loading.
The circuitry has an autoincrement feature so it is not necessary
to specify each address when loading successive locations.

1. Set register 224 to 1 - this puts the RAMs into the
write mode.

2. Load register 227 with the starting address. This is
usually 0.

3. To write into the mapping RAM, load register 225 with
the desired data. Loading this register initiates the
write function into the RAM.

4. The address will automatically increment, so for
successive addresses go to step 3. To load random
locations go to step 2.

5. When completed, load register 224 with 0 to put the RAMs
into the normal mode.
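
As an illustration of the loading steps above, the following Python sketch writes a complete 256-entry transfer function. The `load` callback is a hypothetical stand-in for the host's register-loading routine, and the data register number is passed in by the caller (use the register named in step 3) rather than fixed in the code.

```python
from typing import Callable, Sequence

REG_RAM_MODE = 224   # 1 = write mode, 0 = normal mode (steps 1 and 5)
REG_RAM_ADDR = 227   # starting address (step 2)


def load_transfer_function(load: Callable[[int, int], None],
                           data_reg: int,
                           table: Sequence[int],
                           start: int = 0) -> None:
    """Write `table` into the mapping RAM beginning at address `start`."""
    if start < 0 or start + len(table) > 256:
        raise ValueError("table does not fit in the 256-entry RAM")
    load(REG_RAM_MODE, 1)             # step 1: enter write mode
    load(REG_RAM_ADDR, start)         # step 2: starting address
    for value in table:
        load(data_reg, value & 0xFF)  # steps 3-4: address autoincrements
    load(REG_RAM_MODE, 0)             # step 5: back to normal mode


# Example transfer function: a photographic negative (out = 255 - in).
negative = [255 - i for i in range(256)]
```

Only one address load is needed for the whole table because of the autoincrement feature; loading scattered entries would repeat step 2 before each write.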


5.0 Address Generator


5.1 Description

This card generates the video read and write addresses which are
then used by the image memories. It also generates the refresh
signals for the dynamic RAMs used in the image memory. The
addressing modes are fixed.


6.0  Computer  Interface  Connector  list 


Pin  Signal                     [P1]    Pin  Signal                     [P1]

 1   Data Port Bit 0 (LSB)      [1]     20   Computer Ready             [18]
 2   Data Port Bit 1            [2]     21   Device Busy (Strobe)       [19]
 3   Data Port Bit 2            [3]     22   GND                        [20]
 4   Data Port Bit 3            [4]     23   Device Busy (Strobe) Inv.  [21]
 5   Data Port Bit 4            [5]     24   Output Port Bit 0 (LSB)    [22]
 6   Data Port Bit 5            [6]     25   Output Port Bit 1          [23]
 7   Data Port Bit 6            [7]     26   Output Port Bit 2          [24]
 8   Data Port Bit 7            [8]     27   Output Port Bit 3          [25]
 9   GND                        [9]     28   Output Port Bit 4          [26]
10   Address Port Bit 0 (LSB)   [10]    29   Output Port Bit 5          [27]
11   Address Port Bit 1         [11]    30   Output Port Bit 6          [28]
12   Address Port Bit 2         [12]    31   Output Port Bit 7          [29]
13   Address Port Bit 3         [13]    32   Device Ready               [30]
14   Address Port Bit 4         [14]    33   Not Connected              [31]
15   Address Port Bit 5         [15]    34   Device Ready (Inv.)        [32]
16   Address Port Bit 6         [16]    35   Device Busy                [33]
17   Address Port Bit 7         [17]    36   Device Busy (Inv.)         [34]
18                                      37
19

Notes: All signals are TTL levels active high except those
marked (Inv.) which are active low. Numbering is for the 37
pin D type subminiature connector on the rear panel.

Numbers in brackets are for P1, the 34 pin cable socket on
Interface board #10001 (drawings 001-002, 001-003, 001-004
001-005 and 001-019).


1.0 System Introduction


1.1  Controller 

The controller card contains the master clock oscillator, PROMs
with all the necessary timing and sync signals, and the circuitry
needed for controlling the dynamic RAMs used in the image
memories.

1.2  Image  Memories 

There are two memory cards, the reference and the bit plane.
Each memory card uses 16K dynamic RAMs arranged 256 H by 240 V by
8 deep. They run in real time, synchronous with the video display.
Since the RAMs do not access fast enough, they are multiplexed
4 deep. For a read cycle, four pixels are addressed in
parallel and are serially shifted out. The write cycle works
similarly but in the opposite direction.
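
A rough timing and capacity check shows why four-way multiplexing helps. The figures below assume the 5 MHz pixel rate given elsewhere in this manual and one 16K-location RAM per bit in each of the four banks; the bank organisation is an assumption and is not stated in the text.

```python
pixels = 256 * 240                 # pixels per stored image
bank_locations = 16 * 1024         # locations in one 16K dynamic RAM
capacity = 4 * bank_locations      # four banks accessed in parallel

pixel_period_ns = 1e9 / 5e6        # time per pixel at 5 MHz
memory_cycle_ns = 4 * pixel_period_ns   # time available per RAM access

assert pixels <= capacity          # the image fits in the four banks
print(pixels, capacity, pixel_period_ns, memory_cycle_ns)
# 61440 65536 200.0 800.0
```

Under these assumptions each RAM has 800 ns per access instead of 200 ns, which is the point of fetching four pixels in parallel and shifting them out serially.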

1.2.1  Reference  Memory 

The  reference  memory  stores  an  eight  bit  gray  scale  image.  It 
may  be  filled  with  data  from  the  video  input  or  directly  written 
into  through  the  DMA  interface  from  the  host  computer.  It  may 
be  displayed  through  the  video  output  or  unloaded  through  the  DMA 
interface  to  the  computer. 

1.2.2  Bit  Plane  Memory 

The  bit  plane  memory  is  configured  as  eight  256  X  240  planes. 
Input  data  to  any  of  the  planes  comes  from  a  magnitude 
comparator.  This  compares  the  live  video  input  with  the  output 
of  the  reference  memory  on  a  pixel  by  pixel  basis.  If  the  input 
pixel  is  less  than  or  equal  to  the  reference  pixel,  a  zero  is 
stored.  If  it  is  greater,  a  one  is  stored.  Only  one  plane  may 
be  written  into  at  a  time.  This  memory  may  be  displayed  through 
the  video  output  or  unloaded  to  the  computer  through  the  DMA 
interface. The arithmetic unit may be placed between the bit
plane memory and either of these output channels (see 1.5).
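
The comparator rule can be modelled in a few lines of Python. This is a software sketch of the behaviour described above, not a description of the hardware; the function name and the list-based memory are illustrative only.

```python
from typing import List


def write_plane(memory: List[int], plane: int,
                live: List[int], reference: List[int]) -> None:
    """Store the comparator result for each pixel in bit `plane` of `memory`."""
    mask = 1 << plane
    for i, (pix, ref) in enumerate(zip(live, reference)):
        if pix > ref:
            memory[i] |= mask       # input greater than reference: store a one
        else:
            memory[i] &= ~mask      # less than or equal: store a zero
```

Only the selected plane changes; the other seven bits of each pixel keep their previous contents, which matches the statement that one plane is written at a time.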

1.3  Address  Generator 

This  card  generates  the  addresses  used  to  read  from  and  write  to 
the  memory  in  the  video  modes.  It  also  generates  the  refresh 
addresses  for  the  dynamic  RAMs.  Its  addressing  modes  are  fixed. 

1.4  Analog  Input/Output 

This card is the interface from the camera to the frame store and
from the frame store to the CRT monitor. The video input from
the camera is amplified and digitized at 5 MHz to 8 bit gray
scale resolution. The digitized video is then put onto the
system  bus  destined  for  the  memory.  There  is  a  selector  that 
allows  the  viewing  of  any  of  three  sources  and  either  the  entire 
gray  scale  or  any  of  the  eight  individual  planes.  The 
translation  from  digital  to  video  is  done  at  5  MHz  in  an  8  bit 
digital  to  analog  converter. 

1.5  Arithmetic  Unit 

The  arithmetic  unit  (AU)  is  used  to  process  the  output  of  the  bit 
plane  memory.  It  can  perform  both  arithmetic  and  logic 
functions. RAMs loaded from the computer are included for
mapping both operands.

1.6 Computer Interface

The computer interface card is the link between the frame store
and  the  host  computer.  It  uses  an  8  bit  bidirectional  data  bus 
and  an  8  bit  address  bus.  This  card  allows  the  user  to  program 
the  frame  store  into  its  various  modes.  It  also  has  its  own 
address  generator  for  the  image  memory  that  allows  bidirectional 
Direct  Memory  Access  (DMA). 


1.7  Case  and  Electrical 

The frame store is packaged in an S-100 type computer case. The
motherboard has been modified by adding terminating resistors.
It uses standard S-100 wire wrap cards. There is an internal
power supply that uses 120 VAC +- 10%. The on/off switch has a
key lock and is located, along with the indicator light, on the
front panel. Due to several internal reset circuits, after
turning the unit off, the user should wait 10 seconds before
turning the unit back on. All video connections are made with
BNC connectors located on the back panel. The computer interface
is via a 37 pin subminiature D type connector on the back panel.
The fuseholder is also on the back panel. The fuse is a 3 amp
slo-blo type. Additionally, the three unregulated voltages on
the bus are individually fused. The +8 volt line uses a 7 amp
fuse while the +18 and -18 lines use 2 amp fuses.


2.0  Computer  Interface 


2.1  Description 

The DVDS is designed to be a peripheral to a host computer. Its
modes of operation are programmable and once programmed it needs
no further intervention to operate. Programming is done by
loading registers in the DVDS that control specific functions.
Two 8 bit output ports must be available from the computer. One,
the command port, specifies which register is to be loaded. The
other, the output data port, carries the data to be loaded. See
paragraph 2.2 for the list of registers. These registers are
write only, that is they cannot be read. Because the DVDS uses
the handshaking line on the command port to initiate register
loading, the data port must be loaded before the command port.
Since the registers may come up at random on power up, they
should be initialized to a known condition before the DVDS is
used. This is called booting the system. See the program
listing "BOOT". An output port from the DVDS is used to unload
the image memory to the host computer.


2.2  List  of  Registers,  Addresses,  and  Modes 
2.2.1  Computer  Interface 

Address

0   Bit 0 : 0 - register loading during vertical drive only
            1 - register loading during entire frame
    All other bits are "don't care"

1   Bit 0 : 0 - Reference Memory Live
            1 - Reference Memory Frozen
    Bit 1 : 0 - Bit Plane Memory Frozen
            1 - Bit Plane Memory Live
    Bit 2 : 0 - DMA Output Mode
            1 - DMA Input Mode
    Bit 3 : 0 - Normal (Video IN, OUT) Mode
            1 - DMA Mode
    All other bits are "don't care"

2   DMA Horizontal Address Register (0-255)

3   DMA Vertical Address Register (0-255)

4   DMA enable. During a read this is a flag, that is, it
    initiates the read function; during write it contains
    input pixel data 0-255.

5   Bit 0 : 0 - Display Reference Memory
            1 - Display Bit Plane Memory
    Bit 1 : 0 - Direct Memory Display
            1 - Display Bit Plane Memory through Arithmetic
                Unit (A.U.)
    Bit 2 : 0 - Display Full grey scale
            1 - Display Single Bit of selected Memory
            Note: The Single Bit display of either the
            Reference Memory, the Bit Plane Memory or the
            Bit Plane through the A.U. may be chosen. The
            bit is chosen by Register 6.
    Bit 4 : 0 - A/D on Input Data Bus
            1 - Command Data Bus on Input Data Bus
            Note: This allows Data from the Host Computer
            to be placed on the Bit Plane Memory Input
            rather than the Input Video. This is useful
            for testing.
    All other Bits are "don't care"

6   Bits 0-2 : Writing into the Bit Plane Memory occurs one
    plane at a time. These three bits determine
    which plane (0-7) is active (live).
    For the display of a single bit of one of the
    output sources (Register 5 Bit 2 High) register
    6 selects the bit to be displayed.
    All other Bits are "don't care"
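
The bit assignments above translate directly into mask constants. The Python sketch below is illustrative only; the constant and function names are hypothetical and do not come from the program listings.

```python
# Register 1 bits
REF_FROZEN = 1 << 0      # 0 = reference memory live
BITPLANE_LIVE = 1 << 1   # 0 = bit plane memory frozen
DMA_INPUT = 1 << 2       # 0 = DMA output mode
DMA_MODE = 1 << 3        # 0 = normal video in/out mode

# Register 5 bits
SHOW_BITPLANE = 1 << 0   # 0 = display reference memory
THROUGH_AU = 1 << 1      # 0 = direct memory display
SINGLE_BIT = 1 << 2      # 0 = full grey scale


def register5(bitplane: bool, through_au: bool = False,
              single_bit: bool = False) -> int:
    """Compose the display-select value for register 5."""
    value = SHOW_BITPLANE if bitplane else 0
    if through_au:
        value |= THROUGH_AU
    if single_bit:
        value |= SINGLE_BIT
    return value
```

For example, `REF_FROZEN | BITPLANE_LIVE` evaluates to 3 and `register5(True, through_au=True)` also evaluates to 3. Setting `single_bit=True` adds 4 to the register 5 value, and register 6 then selects which bit is shown.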


2.2.2  Arithmetic  Unit 


Address

224   Bit 0 : 0 - Normal Operation
              1 - Load Masking RAMs
      Bits 1-4: "Don't care"
      Bits 5-7: Arithmetic Function
              000 - CLEAR
              001 - Horizontal Line Number Minus Bit Plane
                    Memory
              010 - Bit Plane Memory Minus Horizontal Line Number
              011 - Bit Plane Memory Plus Horizontal Line
              100 - Bit Plane Memory EXOR Horizontal Line
              101 - Bit Plane Memory OR Horizontal Line
              110 - Bit Plane Memory AND Horizontal Line
              111 - PRESET

225   write enable for Image Mask RAMs (0-255)

226   write enable for Line Mask RAMs (0-255)

227   write address register (0-255)
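
A value for register 224 can be composed from the fields listed above. The sketch below assumes that the three-digit function code is written with bit 7 as its most significant digit; the manual does not state the bit order explicitly, and the dictionary keys are illustrative names only.

```python
AU_FUNCTIONS = {
    "CLEAR": 0b000,
    "LINE_MINUS_MEM": 0b001,
    "MEM_MINUS_LINE": 0b010,
    "MEM_PLUS_LINE": 0b011,
    "MEM_EXOR_LINE": 0b100,
    "MEM_OR_LINE": 0b101,
    "MEM_AND_LINE": 0b110,
    "PRESET": 0b111,
}


def register224(function: str, load_rams: bool = False) -> int:
    """Place the function code in bits 5-7 and the load flag in bit 0."""
    return (AU_FUNCTIONS[function] << 5) | (1 if load_rams else 0)
```

Under that assumption, `register224("MEM_PLUS_LINE")` gives 96 and `register224("CLEAR", load_rams=True)` gives 1.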

2.2.3  Operating  Modes 

Mode                                      Register 1   Register 5

1.  Both Memories Frozen,                      1            0
    Reference Memory Displayed.

2.  Reference Memory Live, Bit                 0            0
    Plane Memory Frozen, Reference
    Memory Displayed.

3.  Both Memories Frozen,                      1            1
    Bit Plane Memory Displayed.

4.  Bit Plane Memory Live, Reference           3            1
    Memory Frozen, Bit Plane Memory
    Displayed. Register 6 contains the Bit
    Plane Number which is Live (0-7).

5.  Both Memories Frozen,                      1            3
    Display Bit Plane Memory through
    Arithmetic Unit.

6.  Bit Plane Memory Live,                     3            3
    Display Bit Plane Memory through
    Arithmetic Unit. Register 6
    contains the Plane Number which
    is Live (0-7). Register 224
    contains A.U. function.

7.  Both Memories Frozen, Reference
    Memory Displayed, DMA Write into
    Reference Memory.

8.  Both Memories Frozen, Reference
    Memory Displayed, DMA read from
    Reference Memory.

9.  Both Memories Frozen, Reference
    Memory Displayed, DMA read from
    Bit Plane Memory.

10. Both Memories Frozen, Reference            12           3
    Memory Displayed, DMA read from
    Bit Plane Memory through the
    Arithmetic Unit. Register 224
    contains A.U. function.

Note: In any of these modes a single plane of the output
source may be viewed by adding 4 to the number in Register 5 and
loading Register 6 with the Plane number (0-7)

2.3  Direct  Memory  Access  (DMA) 

2.3.1  Introduction 

The DVDS allows the host computer to directly access the
memories. Frozen images from either memory or from the bit plane
memory through the A.U. may be unloaded to the computer via the
high speed DMA interface. Additionally, the reference memory can
be directly loaded from the computer. The transfer rate in
either direction can exceed 200K bytes/second. A DMA address
generator is used for both read and write operations. Pixel
coordinates may be loaded into the horizontal and vertical
address registers for random access of the image memory.
Additionally, the DMA address generator will autoincrement after
each read or write operation. This allows a starting address to
be specified and then the circuit will provide sequential
addresses itself. This speeds the data transfer rate
substantially. The top left pixel is at address 0 H, 0 V. The
top right pixel is at address 255 H, 0 V. The bottom left pixel
is at address 0 H, 240 V. The bottom right pixel is at location
255 H, 240 V. The addresses will autoincrement from the last
address loaded into the horizontal address register until 255 is
reached. Then the vertical address will increment by one and the
horizontal address overflows back to 0.
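
The autoincrement rule and the quoted transfer rate can be illustrated with a short Python sketch. The function name is hypothetical, "200K" is taken to mean 200,000 bytes per second, and the behaviour after the last line is not described in the text, so it is not modelled here.

```python
from typing import Tuple


def next_address(h: int, v: int) -> Tuple[int, int]:
    """Advance the DMA address by one pixel, as described above."""
    if h < 255:
        return h + 1, v
    return 0, v + 1        # H overflows back to 0 and V steps by one


# Time to move one 256 x 240 image at the quoted minimum rate.
frame_bytes = 256 * 240
seconds_per_frame = frame_bytes / 200_000
```

With these assumptions a full image of 61,440 bytes takes roughly 0.31 seconds or less to transfer.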

2.3.2  DMA  Read 

The algorithm for performing DMA read is outlined below. See
also the program listings for REF.RAMP and MOVLOD.OBJO

1. load register 0 with 1 - allows DMA access during entire
frame.

2. load register 1 with 12 - this freezes the images and
prepares for a DMA read operation.

3. load register 5 with the proper code for the desired
output source (0 = Reference Memory, 1 = Bit Plane
Memory, 3 = Bit Plane Memory through the ALU).

4. load register 2 (DMA horizontal address register) with
H pixel coordinate.

5. load register 3 (DMA vertical address register) with V
pixel coordinate.

6. load register 4 to assert the DMA read at the previously
specified address. The pixel will be sent to the computer
via the DVDS output port.

7. for random access go to 4. for sequential access go to
6. and repeat until the desired number of pixels are
received.

8. load register 1 with 1. - this returns the system to its
normal state and keeps the images frozen.


Note: The handshaking lines should be checked after each
step


2.3.3  DMA  Write  (Reference  Memory  Only) 


The procedure for performing DMA write is similar to the DMA
read. Also see the program listing REF.READ and MOVLOD.OBJO

1. load register 0 with 1 - this allows DMA access during
the entire frame.

2. load register 1 with 8. - this freezes the image and
prepares the unit for write.

3. load register 5 with 0 - this displays the reference
memory

4. load register 2 with the pixel H coordinate.

5. load register 3 with the pixel V coordinate.

6. load register 4 with the pixel to be written into the
image memory.

7. for random access go to 4. for sequential access go to
6. and repeat until the desired number of pixels are
transferred.

8. load register 1 with 1 - this returns the unit to normal
operation with the images frozen.

Note: The handshaking line should be checked after each
step
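
The write sequence above can be sketched in Python as follows. The `load` callback is a hypothetical stand-in for the host's register-loading routine (including the handshake check), and the ramp data is an illustrative test pattern, not a reproduction of any program listing.

```python
from typing import Callable, Iterable


def dma_write_block(load: Callable[[int, int], None],
                    h: int, v: int, pixels: Iterable[int]) -> int:
    """Write pixels sequentially into the reference memory from (h, v)."""
    load(0, 1)                  # step 1: DMA access during the entire frame
    load(1, 8)                  # step 2: freeze the image, prepare for write
    load(5, 0)                  # step 3: display the reference memory
    load(2, h)                  # step 4: H coordinate
    load(3, v)                  # step 5: V coordinate
    count = 0
    for pixel in pixels:
        load(4, pixel & 0xFF)   # step 6: address autoincrements after each write
        count += 1
    load(1, 1)                  # step 8: normal operation, images frozen
    return count


# One 256-pixel row of a full scale ramp (0 at the left, 255 at the right).
ramp_row = list(range(256))
```

Writing `ramp_row` starting at H = 0 fills exactly one line; continuing to send pixels would carry on into the next line through the autoincrement.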


2.4  Interfacing  Requirements  for  the  Computer 
2.4.1  Computer  Output  Ports 

Two output ports must be provided by the computer, the command
port and the data port. Both must be 8 bits wide, active high at
normal TTL levels. All lines are buffered in the DVDS and
present 1 LS TTL load. 470 ohm pull up resistors to +5 volts are
included on these lines in the DVDS. These ports are treated by
the unit as one 16 bit port. If separate ports are used it is
important to load the data port before the command port, which
has the handshaking lines associated with it. Once data has been
set up and latched on both ports the COMPUTER READY line should
go high. The DVDS will latch both ports soon after the rising
edge of this signal. COMPUTER READY should be a standard TTL
level signal. At this time the DEVICE BUSY line will go high to
indicate that the command is being executed. The value loaded into
register 0 determines when commands will be executed. In the
"V. Drive Only" mode (0), a command will be held until the next
vertical blanking interval before execution. This is useful for
freezing the memories at the end of a field. Other commands such
as DMA transfers should take place at high speed so register 0 is
put into the "Anytime" mode (1). Once the DVDS has executed the
command from the host computer, the DEVICE BUSY line (TTL) will
go low. Also at this time the BUSY strobe will pulse for 1
microsecond. These indicate that the computer may send another
command. Any commands sent before this time will be lost.

2.4.2 Computer Input Port

One 8 bit input port to the computer is required to read data
from the image memory. This port must be standard level, TTL,
active high. After a DMA READ command has been executed, the
pixel datum will be placed on the computer input port. The
DEVICE READY line will then pulse for 1 microsecond to strobe
this information into the computer input port. This signal is at
standard TTL levels. Note that the data should be strobed in on
the trailing edge of the DEVICE READY line.


3.0  Analog  Input/Output 


3.1 Description

This card accepts the video input from a camera, digitizes it,
and passes it to the image memory. It also receives digital
video from various sources, selects the source, converts it to
analog video, and passes it to the CRT monitor. In addition,
composite sync is available for driving the camera. BNC connectors
for video in, video out, and composite sync are located
on the back of the DVDS.

3.2  Analog  to  Digital  section 

This section accepts the input video from the camera and
digitizes it.

input impedance          75 ohms
input signal level       1 volt sync tip to white
sample rate              5 MHz
gray level resolution    256 levels
number of bits           8


A white clip light is provided on the front panel. This will
come on when the video signal is too high, causing the analog to
digital converter to overflow.


3.3  Digital  to  Analog  section 


Either  of  the  stored  images  or  the  arithmetic  unit  may  be 
selected  for  display.  In  addition  to  the  full  eight  bit  grey 
scale  display,  a  single  bit  plane  of  any  of  these  sources  may  be 
selected  for  viewing  with  the  bit  plane  address  register  (see 
2.2.1).  The  single  bit  mode  output  is  either  full  black  or  full 
white.  Blanking  and  sync  are  added  digitally  so  there  is  no 
adjustment  of  sync  or  set  up  levels. 


output impedance         75 ohms
output level             1 V p-p terminated
sample rate              5 MHz
number of bits           8


3.4  Composite  Sync 


Composite  sync  is  used  to  drive  the  video  camera.  The  DVDS  uses 
the  EIA  RS-170  standard. 


output impedance         75 ohms
level                    4 V p-p terminated
polarity                 negative going
horizontal rate          15,734 Hz
horizontal lines         525
active lines             480
vertical rate            60 Hz
frame rate               30/second

Note that because this is a 256 H by 240 V system, the same image
is repeated on both fields.

3.5  Camera  requirements 

The most important requirement is that the camera sync to the
system. A camera that will lock to the above RS-170 format will
work. To obtain maximum use of the 256 gray levels, it should
have a signal to noise ratio greater than 48 dB. The digitizing
rate is 5 MHz so the camera should have a bandwidth of at least
2.5 MHz.


3.6 Adjustment procedure

1) Needed: small screw driver, dual trace oscilloscope,
Drawing #F-026 A/D-D/A component layout.

2)  Remove  the  cover  from  the  machine. 

3)  Connect  the  camera. 

4)  Turn  the  power  on. 

5)  Point  the  camera  on  a  bright  scene  and  open  the  f-stop 
for  maximum  contrast. 


A/D  Adjustment 

6) Set both scope channels for 1V/division.

7) Ground both probes and line the traces up on the second
line from the bottom. This is 0V.

8) Referring to drawing # F-026, connect Ch 1 to TP-1 and
Ch 2 to TP-2. Unground them and sync to Ch 1. Ch 2
should be at approximately +6.7V.

9) Cap the camera lens. Turn the CLAMP LEVEL ADJUST pot
so that the lowest portion of the active video ( not
sync ) just touches 0V.

10) Uncap the camera lens and turn the FULL SCALE ADJUST
pot so that the highest ( brightest ) portions of the
active video just touch the 6.7V line on Ch-2. If the
video goes above the CH-2 reference, the white clip light
on the front panel should turn on.

11)  The  adjustments  in  steps  9  and  10  are  somewhat  interactive 
so  go  back  and  repeat  9  and  10  as  required. 

D/A  Adjustment 

12) Set CH1 for .5V/division

13) Ground the probe and put the trace on the scope
centerline.

14) Connect the probe to TP-3. Unground the scope.

15) Turn the D/A REFERENCE ADJUST pot so the scope reads
- 1.0 Volts.

16) Connect the probe to TP-4. AC couple the scope.

17) Turn the output LEVEL ADJUST pot so that the video
signal is 1 Volt peak to peak (2 Volts peak to peak if
nothing is connected to the output, i.e. unterminated).

This  completes  the  Adjustment  procedure. 


4.0  Arithmetic  Unit 


4.1  Description 

The heart of the unit is an eight bit ALU (2 X 74S381). The two
operands  to  the  ALU  are  an  output  pixel  from  the  bit  plane  memory 
and  the  vertical  line  (address)  of  that  pixel.  The  pixel  is 
first  passed  through  a  PROM  that  is  programmed  to  perform  a  gray 
code  to  binary  conversion.  It  is  then  passed  through  a  256  X  8 
mapping  RAM  to  perform  an  arbitrary  transfer  function.  The  other 
operand,  the  line  number,  is  also  passed  through  an  identical 
mapping  RAM  before  going  to  the  ALU.  The  output  of  the 
arithmetic  unit  can  be  sent  to  the  analog  input/output  card  for 
display  or  to  the  DMA  interface  and  then  to  the  computer.  Note 
that  due  to  the  pipeline  delay  of  the  processor,  the  displayed 
image  through  the  AU  is  shifted  to  the  right  by  one  pixel. 
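
The data path described above can be sketched in a few lines of Python. This is an illustrative model only, not the unit's firmware: it assumes the pixel uses the standard reflected Gray code, that results are truncated to eight bits, and all names (`gray_to_binary`, `GRAY_PROM`, `arithmetic_unit`) are hypothetical.

```python
def gray_to_binary(gray: int) -> int:
    """Convert an 8-bit reflected Gray code value to plain binary."""
    binary = gray & 0xFF
    binary ^= binary >> 1
    binary ^= binary >> 2
    binary ^= binary >> 4
    return binary


# Model of the conversion PROM: one entry per possible 8-bit pixel value.
GRAY_PROM = [gray_to_binary(g) for g in range(256)]

# A 256 x 8 mapping RAM loaded with the identity transfer function.
IDENTITY_MAP = list(range(256))


def arithmetic_unit(pixel_gray, line, image_map, line_map, alu_op):
    """Pass one pixel and its line number through PROM, mapping RAMs and ALU."""
    memory_operand = image_map[GRAY_PROM[pixel_gray & 0xFF]]
    line_operand = line_map[line & 0xFF]
    # Assumption: the result is simply truncated to eight bits.
    return alu_op(memory_operand, line_operand) & 0xFF


# Gray code 0b110 is binary 4; with identity maps, 4 + line 10 gives 14.
result = arithmetic_unit(0b110, 10, IDENTITY_MAP, IDENTITY_MAP,
                         lambda mem, ln: mem + ln)
```

Loading a different table in place of `IDENTITY_MAP` models an arbitrary transfer function on either operand.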


4.1.1 Functions


The  ALU  can  perform  the  following  functions: 


1)  clear 

2)  line  -  memory 

3)  memory  -  line 

4)  memory  +  line 

5)  memory  EXOR  line 

6)  memory  OR  line 

7)  memory  AND  line 

8)  preset 

Note  that  no  provision  for  numeric  overflow  or  underflow  is  made. 
See  section  2.2.2  for  detailed  programming  information. 
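
As a rough illustration of the eight functions and of the missing overflow handling, the sketch below models them on 8-bit operands. The keys 1 to 8 are the list positions above, not the register codes of section 2.2.2. Two points are assumptions: results are taken to wrap modulo 256, and "preset" is taken to force all output bits high, as it does on the 74S381.

```python
# Keys are the list positions above, NOT the register codes (see section 2.2.2).
ALU_FUNCTIONS = {
    1: ("clear", lambda mem, line: 0),
    2: ("line - memory", lambda mem, line: line - mem),
    3: ("memory - line", lambda mem, line: mem - line),
    4: ("memory + line", lambda mem, line: mem + line),
    5: ("memory EXOR line", lambda mem, line: mem ^ line),
    6: ("memory OR line", lambda mem, line: mem | line),
    7: ("memory AND line", lambda mem, line: mem & line),
    8: ("preset", lambda mem, line: 0xFF),  # assumed: all output bits high
}


def alu(function: int, mem: int, line: int) -> int:
    """Apply one ALU function to two 8-bit operands."""
    _name, op = ALU_FUNCTIONS[function]
    # No overflow or underflow provision: assume the result wraps to 8 bits.
    return op(mem & 0xFF, line & 0xFF) & 0xFF


overflowed = alu(4, 200, 100)   # 300 wraps to 44
underflowed = alu(3, 10, 20)    # -10 wraps to 246
```

If wraparound is undesirable, the mapping RAMs can be loaded with tables that scale the operands so that the sum or difference stays within 0 to 255.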

4.2 Loading the Mapping RAMs


The  following  is  the  general  algorithm  for  entering  a  transfer 
function.  The  RAMs  use  a  common  address  generator  for  loading. 
The  circuitry  has  an  autoincrement  feature  so  it  is  not  necessary 
to  specify  each  address  when  loading  successive  locations. 


1. Set register 224 [value illegible]; this puts the RAMs into the
write mode.


2.  Load  register  227  with  the  starting  address.  This  is 
usually  0. 

3.  To  write  into  the  image  mapping  RAM,  load  register  225 
with  the  desired  data.  To  write  into  the  line  mapping 
RAM,  load  register  226  with  the  desired  data.  Loading 
these  registers  performs  the  write  function  in  the  RAM. 

4.  The  address  will  automatically  increment,  so  for 
successive  addresses  go  to  step  3.  To  load  random 
locations  go  to  step  2. 

5.  When  completed,  load  register  224  with  a  number 
representing  the  desired  ALU  function. 
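
The loading steps above can be sketched as a routine that builds the list of register writes. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and the write-mode value and the ALU function code are not given in this section, so the caller must supply them (the values used in the example are placeholders only).

```python
REG_MODE = 224         # write mode / ALU function select
REG_IMAGE_DATA = 225   # data write into the image mapping RAM
REG_LINE_DATA = 226    # data write into the line mapping RAM
REG_ADDRESS = 227      # starting address for the autoincrement counter


def mapping_ram_writes(table, data_register, write_mode, alu_function, start=0):
    """Return the ordered (register, value) writes that load one mapping RAM."""
    if data_register not in (REG_IMAGE_DATA, REG_LINE_DATA):
        raise ValueError("data_register must be 225 or 226")
    if start < 0 or start + len(table) > 256:
        raise ValueError("table does not fit in a 256-entry RAM")
    writes = [(REG_MODE, write_mode),      # step 1: enter write mode
              (REG_ADDRESS, start)]        # step 2: starting address
    # Steps 3 and 4: each data write stores one entry; the address autoincrements.
    writes += [(data_register, value & 0xFF) for value in table]
    writes.append((REG_MODE, alu_function))  # step 5: select the ALU function
    return writes


# Example: an inverting transfer function for the image mapping RAM.
PLACEHOLDER_WRITE_MODE = 0     # NOT the real value; illegible in this section
PLACEHOLDER_ALU_FUNCTION = 0   # NOT a real code; see section 2.2.2
inverse = [255 - i for i in range(256)]
writes = mapping_ram_writes(inverse, REG_IMAGE_DATA,
                            PLACEHOLDER_WRITE_MODE, PLACEHOLDER_ALU_FUNCTION)
```

Each pair in `writes` would then be sent to the DVDS as one command; loading scattered locations simply repeats the address write before each group of data writes.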


5.0  Address  Generator 


5.1  Description 

This  card  generates  the  video  read  and  write  addresses  which  are 
then  used  by  the  image  memories.  It  also  generates  the  refresh 
signals  for  the  dynamic  RAMs  used  in  the  image  memory.  The 
addressing  modes  are  fixed. 




6.0  Connector  list 


6.1 Computer Interface - P1 on L(N)-011 to P1 on P-035

Pin  Signal                     Socket    Pin  Signal                     Socket
 1   Data Port Bit 0 (LSB)      [1]       20   Computer Ready             [18]
 2   Data Port Bit 1            [2]       21   Device Busy (Strobe)       [19]
 3   Data Port Bit 2            [3]       22   GND                        [20]
 4   Data Port Bit 3            [4]       23   Device Busy (Strobe) Inv.  [21]
 5   Data Port Bit 4            [5]       24   Output Port Bit 0 (LSB)    [22]
 6   Data Port Bit 5            [6]       25   Output Port Bit 1          [23]
 7   Data Port Bit 6            [7]       26   Output Port Bit 2          [24]
 8   Data Port Bit 7            [8]       27   Output Port Bit 3          [25]
 9   GND                        [9]       28   Output Port Bit 4          [26]
10   Address Port Bit 0 (LSB)   [10]      29   Output Port Bit 5          [27]
11   Address Port Bit 1         [11]      30   Output Port Bit 6          [28]
12   Address Port Bit 2         [12]      31   Output Port Bit 7          [29]
13   Address Port Bit 3         [13]      32   Device Ready               [30]
14   Address Port Bit 4         [14]      33                              [31]
15   Address Port Bit 5         [15]      34   Device Ready (Inv.)        [32]
16   Address Port Bit 6         [16]      35   Device Busy                [33]
17   Address Port Bit 7         [17]      36   Device Busy (Inv.)         [34]

Notes: All signals are TTL levels, active high, except those
marked (Inv.), which are active low. Pin numbering is for the 37
pin D type subminiature connector on the rear panel.

Numbers in brackets are for the 34 pin cable socket on
L(N)-011 and P-035.




6.2 Arithmetic Unit Connector - P2 on L(N)-011 to P1 on N-028

Pin  Signal                          Pin    Signal
 1   DMA Vertical Address 0 (LSB)    11     ALU Output Port Bit 2
 2   DMA Vertical Address 1          12     ALU Output Port Bit 3
 3   DMA Vertical Address 2          13     ALU Output Port Bit 4
 4   DMA Vertical Address 3          14     ALU Output Port Bit 5
 5   DMA Vertical Address 4          15     ALU Output Port Bit 6
 6   DMA Vertical Address 5          16     ALU Output Port Bit 7
 7   DMA Vertical Address 6          17     DMA ALU LATCH ENABLE
 8   DMA Vertical Address 7          18     ALU LATCH CLOCK
 9   ALU Output Port Bit 0 (LSB)     19     Output Port Select (OPS)
10   ALU Output Port Bit 1           20-34  Spare

6.3 Address Connector - P2 on N-028 to P1 on L-007

Pin  Signal                           Pin   Signal
 1   Read Vertical Address 0 (LSB)    6     Read Vertical Address
 2   Read Vertical Address 1          7     Read Vertical Address
 3   Read Vertical Address 2          8     Read Vertical Address
 4   Read Vertical Address 3          9-16  Spare
 5   Read Vertical Address 4

7.0  Apple  interface 

7.1 Introduction

The  Apple  Interface  contains  the  hardware  necessary  to  interface 
the  DVDS  to  an  Apple  II  microcomputer.  It  provides  for  two  8 
bit  output  ports,  one  8  bit  input  port  and  all  necessary 
handshaking  lines.  It  normally  resides  in  slot  #4  of  the  Apple's 
peripheral  bus. 

7.2  Output  Ports 

The output ports are memory mapped locations. The first port
contains the address of the DVDS register to be loaded. This is
called the Command Address Port. When the card is in slot #4 its
address is 49356 (C0CC HEX). The second port sends the data to
be loaded into the addressed register. This is called the
Command Data Port. With the card in slot #4 its address is 49357
(C0CD HEX).

7.3  Input  Ports 

The input port is used to receive DMA data from the DVDS. In
slot #4 its address is 49344 (C0C0 HEX). Data is strobed into
this  memory  mapped  register  by  the  DEVICE  READY  (Inv.)  line  from 
the  DVDS.  It  then  can  be  read  by  the  computer. 
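
The slot #4 addresses quoted above follow from the Apple II convention that slot n owns sixteen device-select locations starting at C080 HEX plus 16 times n. The sketch below reproduces that arithmetic. The port offsets are inferred from the slot #4 addresses given in the text; whether the card decodes the same offsets in other slots is an assumption, and the function names are hypothetical.

```python
def slot_base(slot: int) -> int:
    """First device-select address for an Apple II peripheral slot."""
    if not 0 <= slot <= 7:
        raise ValueError("Apple II slots are numbered 0 to 7")
    return 0xC080 + 0x10 * slot


# Offsets inferred from the slot #4 addresses quoted in the text.
PORT_OFFSETS = {
    "input": 0x0,            # 49344 in slot #4
    "command_address": 0xC,  # 49356 in slot #4
    "command_data": 0xD,     # 49357 in slot #4
}


def port_address(slot: int, port: str) -> int:
    """Decimal address of a named port, as used with PEEK and POKE."""
    return slot_base(slot) + PORT_OFFSETS[port]


slot4 = {name: port_address(4, name) for name in PORT_OFFSETS}
```

For slot #4 this gives a base of C0C0 HEX (49344 decimal), so the three ports fall at C0C0, C0CC and C0CD HEX.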

7.4  Handshaking  Lines 

A  single  COMPUTER  READY  line  is  provided  for  both  output  ports. 
The  decoded  write  pulse  for  the  Command  Address  Port  also  serves 
as  the  COMPUTER  READY  line.  Thus  the  Command  Data  Port  is  loaded 
first  followed  by  the  Command  Address  Port.  Loading  this 
register  signals  the  DVDS  that  a  command  must  be  executed. 

The Apple Interface uses two handshaking lines provided by the
DVDS, DEVICE BUSY and DEVICE READY STROBE (Inv.). Both are mapped
into memory location 49345 (C0C1 HEX), which is the status
register.  DEVICE  BUSY  goes  active  (high)  upon  activation  of  the 
COMPUTER  READY  line.  It  remains  high  until  the  DVDS  has 
completed  execution  of  the  command.  It  then  goes  low.  DEVICE 
BUSY  is  bit  1  of  the  status  register. 

When  the  DVDS  performs  a  DMA  Read  command  it  leaves  the  data  in 
the  input  register  and  sets  (forces  high)  a  flag  connected  to  bit 
0  of  the  status  register.  The  Apple  may  monitor  this  bit  to  wait 
until  the  DMA  transfer  is  complete.  Reading  the  input  register 
resets  this  bit  to  ready  it  for  the  next  transfer. 
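
The handshake described in this section can be sketched as two polling routines. This is a minimal model, not the original driver: `peek` and `poke` stand for whatever memory access the host provides, `FakeDVDS` is a hypothetical stand-in used only to exercise the logic, and the poll limits are arbitrary.

```python
COMMAND_ADDRESS_PORT = 49356
COMMAND_DATA_PORT = 49357
INPUT_PORT = 49344
STATUS_REGISTER = 49345
DMA_READY_BIT = 0x01   # bit 0: data waiting in the input register
BUSY_BIT = 0x02        # bit 1: DEVICE BUSY


def send_command(peek, poke, register, value, max_polls=10000):
    """Load one DVDS register: data port first, then the address port."""
    poke(COMMAND_DATA_PORT, value & 0xFF)
    poke(COMMAND_ADDRESS_PORT, register & 0xFF)  # this write is COMPUTER READY
    for _ in range(max_polls):
        if not (peek(STATUS_REGISTER) & BUSY_BIT):
            return
    raise TimeoutError("DVDS stayed busy")


def read_dma_byte(peek, max_polls=10000):
    """Wait for the DMA flag, then read (and thereby clear) the input register."""
    for _ in range(max_polls):
        if peek(STATUS_REGISTER) & DMA_READY_BIT:
            return peek(INPUT_PORT)
    raise TimeoutError("no DMA data arrived")


class FakeDVDS:
    """Hypothetical stand-in for the hardware, for trying out the routines."""

    def __init__(self):
        self.registers, self.log = {}, []
        self.data, self.busy_polls, self.dma_pending = 0, 0, None

    def poke(self, address, value):
        self.log.append((address, value))
        if address == COMMAND_DATA_PORT:
            self.data = value
        elif address == COMMAND_ADDRESS_PORT:
            self.registers[value] = self.data
            self.busy_polls = 2          # report busy for two status reads

    def peek(self, address):
        if address == STATUS_REGISTER:
            status = DMA_READY_BIT if self.dma_pending is not None else 0
            if self.busy_polls:
                self.busy_polls -= 1
                status |= BUSY_BIT
            return status
        if address == INPUT_PORT:
            value, self.dma_pending = self.dma_pending, None
            return value
        return 0


device = FakeDVDS()
send_command(device.peek, device.poke, register=227, value=0)
device.dma_pending = 0x5A
received = read_dma_byte(device.peek)
```

The order of the two writes matters: because the address-port write doubles as COMPUTER READY, writing the address first would start the command before its data had been latched.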
2. This work is the product of United States
to U.S. copyright.


PIPE (Pipelined Image Processing Engine) is an
experimental, multi-stage, multi-pipelined image processing
device.  It  can  acquire  images  from  a  variety  of  sources,  such  as 
analog  or  digital  television  cameras,  ranging  devices,  and 
conformal  mapping  arrays.  It  can  process  sequences  of  images  in 
real  time,  through  a  series  of  local  neighborhood  and  point 
operations,  under  the  control  of  a  host  device.  Its  output  can 
be  configured  for  monitors,  robot  vision  systems,  iconic  to 
symbolic  mapping  devices,  and  image  processing  computers.  In 
addition  to  a  forward  flow  of  images  through  successive  stages  of 
operations  as  in  a  traditional  pipeline,  other  paths  between  the 
stages  of  the  device  permit  concurrent,  interacting  pipelining  of 
image  flow  in  other  directions.  In  particular,  recursive  paths 
returning  images  into  each  stage,  and  feedback  of  the  results  of 
operations  from  each  stage  to  the  preceding  stage  are  supported. 
The  architecture  facilitates  a  variety  of  functions  such  as 
relaxation  and  interactions  of  images  over  time.  Numerous 
operations  are  supported;  within  each  stage  these  include 
arithmetic  and  Boolean  neighborhood  and  point  operations  on 
images.  Between-stage  operations  on  each  pixel  include 
thresholding,  Boolean  and  arithmetic  operations,  functional 
mappings,  and  a  variety  of  functions  for  combining  pixel  data 
converging  via  the  multiple  pipelined  image  paths.  The  device 
also  implements  alternative  processing  modes,  including  "MIMD" 
operations  specific  to  regions  of  interest  defined  by  the  host 
device  or  by  previous  operations  on  the  image,  and  variable 
resolution  pyramid  operations. 


Requirements  of  Robot  Vision. 


A  robot  vision  system  has  requirements  which  differ  in  some 
respects  from  those  of  other  types  of  machine  vision.  Since  the 
function  of  a  robot  is  to  perform  physical  actions  in  space  and 
time  similar  to  those  performed  by  humans,  it  is  not  surprising 
that  these  requirements  are  similar  to  those  imposed  on 
biological  vision  systems.  In  some  machine  vision  applications, 
such  as  interpretation  of  images  from  diagnostic  or  satellite 
equipment,  each  new  picture  is  a  new  problem;  in  robot  vision, 
like  biological  vision,  the  input  is  a  stream  of  images  in  which 
each  frame  differs  from  the  last  by  some  small  displacement  of 
the camera or of the objects in the scene. In many machine vision
tasks the important function is to recognize certain kinds of
objects, but in robot vision, again like biological vision, most
of the vision system's time is spent providing information about
the position of things in space for guidance and servoing. This
is true after objects have already been recognized, before they
are recognized, and even if they cannot be recognized by the
system. The robot must understand the spatial occupancy of its
environment  and  its  own  relation  to  it  in  order  to  avoid  striking 
surfaces,  whether  or  not  those  surfaces  are  part  of  recognized 
objects.  Ideally  the  robot  vision  system  should  provide  a 
description  of  the  geometric  properties  of  unrecognized  objects 
sufficient  to  permit  the  robot  to  manipulate  them;  for  example, 
to  remove  a  foreign  object  from  the  workspace.  Most  importantly, 
a  robot  vision  system  shares  with  the  biological  system  the 
necessity  of  synchronization  with  real-time  events. 

From  these  considerations,  we  can  derive  some  important 
properties  of  a  robot  vision  system.  Because  the  robot's  function 
is  oriented  towards  actions  and  events  which  extend  over  time, 
the  appropriate  unit  of  analysis  is  the  image  sequence  rather 
than  a  single  image.  This  implies  that  the  system  must  capture 
and  operate  on  image  sequences  of  a  length  appropriate  to  the 
speed  of  events.  Because  the  robot's  world  possesses  continuity, 
the  system  should  take  advantage  of  information  discovered  in 
previous  views.  This  suggests  that  the  system  should  not  only 
build  internal  models  of  the  environment  from  successive  views, 
but  also  that  it  should  be  able  to  make  use  of  hypotheses  from 
this  model  in  interpreting  subsequent  images.  The  importance  of 
servoing  over  classification  implies  that  the  system  must  provide 
information  at  many  levels  of  analysis.  That  is,  it  may  be 
required  to  supply  information  on  range  to  points,  on 
inclinations  of  edges  and  surfaces,  on  translation  and  rotation 
velocities  of  features,  and  similar  descriptive  properties  of 
objects in space and time. These must be made available to the
robot control system as rapidly as they are discovered, prior to
and independently of classification. Above all, the system must
operate in real-time; a late answer is no answer for a robot
guidance  system.  It  is  this  which  compels  us  to  examine  special- 
purpose  architectures  as  a  means  for  accomplishing  the  other 
criteria. 

The NBS Robot Vision System

The  general  plan  of  a  robot  system  which  incorporates  all  of 
these  functions  is  easy  enough  to  sketch,  and  the  basic  ideas  are 
included  in  Figure  1,  which  is  adapted  from  a  robot  system  being 
developed by the Sensory-Interactive Robotics Group at NBS. On
the  left  is  a  sensory  processing  hierarchy,  in  the  center  a 
knowledge  representation  system,  and  on  the  right  a  task- 
decomposition  and  control  hierarchy.  This  entire  system  could  be 
considered  a  "special  purpose  architecture",  in  the  sense  that  it 
is  a  special  purpose  machine  which  has  computers  as  components. 
However,  most  of  the  elements  are  currently  implemented  on 
individual  computers  of  traditional  design,  and  in  this  chapter  I 
intend  to  focus  primarily  on  some  particular  vision  elements 
within  this  system  which,  for  reasons  of  required  processing 
speed,  have  been  implemented  on  special  purpose  architectures  in 
the  sense  of  specially-designed  hardware. 


Before  considering  these  elements  in  detail,  a  brief  review 
of  the  system  operations  will  help  clarify  their  tasks  and  design 
goals.  The  sensory  processing  hierarchy  accepts  data  from  the 
sensors  (in  particular  cameras  for  vision  sensing)  and  attempts 
to  form  a  hierarchical  syntactic  description  of  the  current  view 
of  the  world.  Note  that  symbolic  and  parametric  information  is 
made  collaterally  available  to  the  system's  knowledge 
representation  at  every  level  of  this  process.  The  task- 
decomposition  and  control  hierarchy  on  the  opposite  side  accepts 
goals  in  the  form  of  external  commands.  It  attempts  to  use 
generic  knowledge  of  how  to  choose  actions  together  with 
particular  knowledge  of  the  current  state  of  the  world  to 
generate  actions  which  accomplish  the  goals.  Knowledge  of  the 
current  state  of  the  world  is  made  collaterally  available  to 
every  level  of  the  task-decomposition  and  control  hierarchy  from 
the  system  knowledge  representation. 

The  knowledge  representation  contains  much  more  than  the 
information  currently  available  from  the  sensory  processing 
hierarchy.  In  addition  to  information  which  comes  from  a  priori 
knowledge sources, and knowledge of the control hierarchy's
current  state,  the  knowledge  representation  contains  a 
description of the world built up over all past views from the
camera. It thus contains much information which is not
extractable from the current viewing window, but which bears on
the interpretation of that which is. The current scene
descriptions from the sensory processing hierarchy must be
reconciled with this information, and the sensory processing
hierarchy in turn may be guided by it. Similarly, the knowledge
representation's  model  of  the  world  must  be  formulated  into 
specific  answers  to  any  particular  instantaneous  requirement  of 
the  task-decomposition  and  control  hierarchy.  In  turn,  the  state 
of  the  action-generating  processes  provides  guidance  for  the 
knowledge  representation  in  interpreting  its  input  from  the 
sensory  system.  There  is  thus  a  two-way  flow  between  the 
knowledge representation and the sensory processing system which
"servos" the internal model of the world to the observed data,
and  which  provides  hypotheses  that  guide  the  processing  and 
interpretation  of  the  data.  A  similar  reciprocal  flow  links  the 
knowledge  representation's  internal  model  of  the  world  with  the 
questions  and  information  produced  by  the  control  hierarchy. 

The  dimension  along  which  the  control  hierarchy  is  divided 
into  functional  levels  (task,  subtask,  elemental  cartesian  move, 
joint-space  velocity)  is  not  the  same  as  the  dimension  along 
which  the  vision  processing  hierarchy  is  divided  into  levels 
(image-plane point properties, image-plane features, world-frame
objects.)  The  knowledge  representation,  which  is  not  itself 
hierarchically  structured,  contains  information  relevant  to  all 
of  the  different  descriptions  of  the  world  required  by  these 
levels.  Its  information  is  maintained  in  a  semantic  form  from 
which  syntactic  and  parametric  descriptions  of  the  world  adapted 
to  the  requirements  of  any  of  the  functional  modules  can  be 
produced. The representation scheme of this semantic form is
chosen principally for convenience in maintaining and organizing
the information. The syntactic forms of the various levels of the
sensory and control systems are chosen primarily for functional
utility in the modules concerned. The "syntactic/semantic
transform" modules extract level-specific information from the
knowledge representation and instantiate it into the frame and
symbols needed for any given functional module. In the other
direction, they contain systems for incorporating level-specific
syntactic  and  parametric  sensory  or  control  information  into  the 
internal  model. 

This  diagram  outlines  a  general  conception  of  the  relations 
of  sensory  and  other  knowledge  in  a  robot  system.  What  is  vastly 
more  difficult  is  to  specify  the  algorithms  and  hardware  required 
to  give  substance  to  the  boxes  in  such  a  diagram.  A  first 
generation  system  was  constructed  entirely  from  micro-computers. 
It  used  binary  image  processing  of  both  normal  and  structured- 
light images. The knowledge representation and the range of
requests acceptable from the control hierarchy were very limited
in  scope.  Nonetheless,  this  system  could  accept  CAD  descriptions 
of  simple  parts,  recognize  them,  and  provide  servo  information 
that  allowed  the  control  system  to  manipulate  them  in  real-time. 
A  second  generation  system  has  been  under  development  for  some 
time.  This  project  includes  much  more  elaborate  structures  for 
knowledge  representation,  and  a  correspondingly  richer  sensory 
processing  capability  which  employs  grayscale  vision.  Most  of 
this  second  generation  system  development  effort  consists  of 
software  improvements  still  running  in  a  multi-microcomputer 
dataflow  (as  opposed  to  distributed)  system.  With  the  move  to 
gray-scale  vision,  however,  it  became  apparent  that  the  early 
stages  of  image  processing  involved  computations  which  could  not 
be  handled  in  real-time  by  microcomputers.  Further,  it  was  felt 
that  for  these  early  image-processing  functions  a  sufficiently 
good  understanding  of  the  requirements  of  robot  vision  existed  to 
justify  the  development  of  special-purpose  vision  architectures. 
Accordingly,  an  image  preprocessor  (called  PIPE,  for  Pipelined 
Image  Processing  Engine)  and  a  feature  extractor  (called  ISMAP 
for Iconic to Symbolic MAPper) were designed. The first
prototypes  of  these  devices  are  now  in  operation.  Their  relations 
are  depicted  in  Figure  2,  which  shows  the  data-flow  paths  for  the 
PIPE  and  ISMAP  elements,  to  and  from  the  upper  levels  of  the 
system  (indicated  here  as  "host  memory"). 

These  devices  will  replace  microcomputers  as  the  first  two 
elements  of  the  sensory  processing  hierarchy  in  the  second 
generation  of  the  robot  system  depicted  in  figure  1.  The  first  of 
these elements is an initial processing stage (image
pre-processing) which accepts an image consisting of an array of
gray-scale  values  and  produces  a  similar  array  of  symbolic  values 
which  encode  features  of  the  gray-scale  array,  such  as  edges,  or 
textures.  This  process  of  feature  detection  produces  an  iconic 
description  of  image  features  from  an  iconic  description  of  image 
intensities.  That  is,  for  both  the  input  and  the  output  of  this 
process,  the  global  geometric  relations  of  the  input  values  or 
the  output  features  are  implicit  in  their  location  in  the  image 
array.  The  feature  detection  process  transforms  one  iconic  array 
into  the  other  by  making  local  properties  explicit  as  symbols.  In 
most  cases  the  process  attempts  to  describe  local  properties 
(features)  which  achieve  some  independence  of  circumstances  such 
as  illumination. 

Following  the  feature  detection  process,  the  second  element 
of  the  system  performs  a  feature  extraction  step  (iconic  to 
symbolic  mapping.)  This  begins  the  process  of  producing  a 
relational  feature  description  of  objects  which  is  independent  of 
circumstances  of  viewing  position.  In  this  step,  the  iconic  image 
of features, in which symbol values are indexed by location, is
mapped into a space in which location is indexed by feature
value. This produces a representation in which the data which was
previously explicit (the feature type) is now implicit in the
location of the data in the representation. The image location of
the features, which was previously implicit in the iconic
representation, is now the explicit data represented by values in
the new space. This accomplishes two goals: it permits finding
the locations of features of interest by direct indexing rather
than by exhaustively searching the image, and it extracts
locations as data to which subsequent operations can be applied.
These subsequent operations attempt to find meaningful relations
among the location data which correspond to geometric relations
of  features.  The  geometric  relations  discovered  then  serve  as 
input  to  subsequent  classification  processes. 
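
The inversion from "value indexed by location" to "location indexed by value" can be illustrated with a small software sketch. It only shows the idea; it is not a description of how ISMAP is built, and the function name and the feature codes are hypothetical.

```python
from collections import defaultdict


def iconic_to_symbolic(feature_image):
    """Map an iconic feature image to a table of locations per feature value."""
    locations = defaultdict(list)
    for row, line in enumerate(feature_image):
        for col, feature in enumerate(line):
            # The feature value becomes the index; the location becomes the data.
            locations[feature].append((row, col))
    return dict(locations)


# Hypothetical feature codes: 0 = no feature, 1 = edge, 2 = corner.
feature_image = [
    [0, 1, 0],
    [1, 2, 1],
    [0, 1, 0],
]
table = iconic_to_symbolic(feature_image)
corner_locations = table[2]   # direct indexing, no search of the image
```

Once the table exists, finding every corner is a single lookup, and the coordinate lists themselves can be passed to later stages that look for geometric relations.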

Design  Philosophy  of  PIPE 

In designing the PIPE and ISMAP devices to play these roles
in the system, we attempted to optimize the design for the
special requirements of robot vision discussed above. The
principal goals selected were: 1) real-time processing of images
at field-rate, 2) provision for interactions between related
images, such as those arising from dynamic image sequences or
from stereoscopic views, 3) provision of the ability to apply
different  algorithms  to  different  regions  of  the  image  in  real 
time,  4)  ability  to  perform  multi-resolution  image  processing, 
and  5)  provision  for  guiding  processing  by  knowledge-based 
commands  and  "hypothesis  images"  supplied  from  the  upper  levels 
of  the  system. 

PIPE is a hardware device specialized for parallel image
processing rather than a fully general purpose parallel computer.
Through its design, it facilitates a variety of common and
important image-processing techniques, as well as several
experimental approaches. Within the broad limits of the processes
it supports, it is an extremely fast and flexible device. On the
other hand, it is not a general purpose computer and it is not
possible to program arbitrary algorithms on it, or at least not
in an efficient manner. In most cases we have found that
processes which are well suited to PIPE may be substituted for
others which are not, while accomplishing the same image-
processing goal. PIPE is intended as a processor for local
operations on images; it is not designed to perform efficiently
operations that require global knowledge of the image. It is
intended that PIPE operate in conjunction with a host, such as
the upper levels of the NBS robot vision system, which will
perform  global  image  operations,  relieved  of  the  processing 
burden  of  large  scale  repetitive  local  operations.  Thus,  PIPE 
itself accepts iconic data images, and typically produces iconic
images whose pixel values are Boolean vectors describing local
properties  of  the  pixel  neighborhood.  PIPE  relieves  the  upper 
levels  of  the  system  of  costly  low-level  local  processing  which 
must  be  performed  over  the  entire  image  space. 

The  basic  design  of  PIPE  was  partly  inspired  by  an  earlier 
proposed machine which was in turn adapted from biological
models  of  image  processing.  The  essential  organization  is  a 
three-dimensional  architecture  consisting  of  a  sequence  of  image- 
processing planes (Figure 3). Each processing plane has storage
arrays  which  receive  images,  and  operators  which  act  on  them.  The 
storage  arrays  may  be  considered  to  be  in-register  with  one 
another  in  the  third  dimension,  so  that  each  pixel  neighborhood 
in  an  array  occupies  a  step  in  a  pipeline  of  processes  extending 
up  through  the  stack  of  planes.  The  result  is  an  image-flow 
architecture  in  which  the  images  move  upwards  through  the  stack 
as  subsequent  images  replace  them  from  below.  At  each  stage  the 
operations  on  the  pixels  provide  interaction  in  the  lateral 
directions  with  neighbors  in  the  same  array,  and  in  the  vertical 
directions  with  neighbors  in  the  preceding  and  succeeding  arrays. 
Logically  the  machine  can  be  considered  as  a  bundle  of 
bidirectional  pipeline  processors  with  lateral  interactions. 

In practice, PIPE actually consists of stages with storage
arrays and computational modules (figure 4) which operate over
every pixel neighborhood in the arrays in a single field time
(1/60 sec). Images are transferred from stage to stage at field-
rate (60 images/sec) by three concurrent pathways which provide
for  interacting,  image-flow  transfers  between  stages.  These 
interconnect  a  variable  number  of  identical  modular  image- 
processing  stages.  The  three  pathways,  shown  in  figure  3,  are: 
the  forward  pathway,  which  acts  as  a  traditional  pipelined  image- 
processing  path;  the  retrograde  pathway,  which  carries  images  in 
the  opposite  direction  (i.e.,  from  the  output  of  a  stage  to  the 
input of its predecessor); and the recursive pathway, which
carries  an  image  from  the  output  of  a  stage  back  into  the  input 
of  the  same  stage. 

At  the  input  to  each  stage  (labeled  "combining  logic"  in 
figure 4) the images carried by the three pathways may be
subjected  individually  to  any  arithmetic  or  Boolean  operation. 
Any  linear  arithmetic  or  Boolean  operation  may  then  be  used  to 
combine  them  into  a  final  input  image,  prior  to  its  storage  in 
one  of  two  buffers  within  the  stage. 
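
A software sketch of the combining logic is given below, assuming images are lists of rows of integer pixels. The per-pathway operations and the linear combination follow the description above; the clamping to the range 0 to 255 is an assumption (the pixel width is not stated here), and the function and parameter names are hypothetical.

```python
def combine_inputs(forward, recursive, retrograde, ops, weights, offset=0):
    """Combine the three pathway images into one input image for a stage.

    ops     -- three point functions, applied to forward, recursive and
               retrograde pixels respectively
    weights -- three coefficients of the linear combination
    """
    combined = []
    for f_row, r_row, b_row in zip(forward, recursive, retrograde):
        row = []
        for f, r, b in zip(f_row, r_row, b_row):
            terms = (ops[0](f), ops[1](r), ops[2](b))
            value = sum(w * t for w, t in zip(weights, terms)) + offset
            # Assumption: pixels are 8 bits wide, so clamp the result.
            row.append(max(0, min(255, int(value))))
        combined.append(row)
    return combined


identity = lambda pixel: pixel

# Temporal difference: the new frame (forward) minus the frame this stage
# held one field earlier (recursive); the retrograde image is ignored.
current = [[10, 20], [30, 40]]
previous = [[10, 15], [35, 40]]
unused = [[0, 0], [0, 0]]
difference = combine_inputs(current, previous, unused,
                            ops=(identity, identity, identity),
                            weights=(1, -1, 0))
```

Using the recursive pathway as the "previous frame" input is one way a stage can make images interact over time, as described earlier.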

Within  each  stage,  the  computational  operators  may  act  on 
images  stored  in  either  or  both  of  the  two  buffers.  These 
operators  are  contained  in  the  sections  marked  "operations"  in 
figure 4. Each stage can perform two simultaneous and independent
arithmetic or Boolean neighborhood operations. Following the
neighborhood operations, and prior to output from the stage, the
images resulting from the neighborhood operations, or either of
the images in the buffers, may undergo a further transformation
by an arbitrary function of one or two arguments. If the
function is of two arguments, the second argument may be drawn
from any source within the stage, including the transformations
of the same or the other image. The forward, recursive, and
retrograde  pathways,  in  any  combination,  may  accept  images  from 
the  result  of  any  of  these  operations  or  from  any  of  the  buffers. 
This  is  controlled  by  the  section  marked  "distribution  logic"  in 
figure  4. 

The  representation  of  figure  4  shows  the  actual  physical 
grouping  of  processing  elements  and  storage  as  they  exist  in  the 
machine  for  structural  reasons.  Functionally,  the  combining 
logic  of  one  stage  and  the  operations  and  distribution  sections 
of  the  preceding  stage  may  form  a  more  convenient  conceptual 
processing  unit  for  many  purposes.  Figure  5  shows  the 
architecture  of  PIPE  redrawn  in  this  fashion,  where  the 
processors  are  the  conceptual  rather  than  the  physical  units.  In 
this  figure  the  frame-buffer  storage  elements  can  be  unbundled 
from  the  processing  units,  so  that  the  tri-directional  image-flow 
relations  are  easier  to  visualize. 

In  an  alternative  mode,  one  of  the  two  image  buffers  in  each 
stage  may  serve  as  a  map  for  selecting  the  processing  algorithms 
being  applied  to  the  contents  of  the  other  buffer.  In  this  mode, 
PIPE functions as a multi-instruction stream, multi-data stream
(MIMD)  machine,  with  one  buffer  defining  regions  of  interest  over 
which  each  set  of  algorithms  shall  be  applied,  on  a  pixel-by- 
pixel  basis,  as  the  image  is  processed.  In  the  prototype  version, 
sixteen  such  alternative  processing  algorithms  may  be  selected 
for  different  regions  of  the  image  within  a  single  field  time. 
However,  this  is  easily  increased  (up  to  256  processing 
algorithms)  simply  by  adding  additional  storage  to  each  stage  to 
hold  the  appropriate  tables  and  parameters. 
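
The region-of-interest mode can be sketched as follows, with one buffer holding the image and the other holding, for each pixel, the number of the algorithm to apply. For brevity the sketch uses point operations only, whereas PIPE also applies neighborhood operations this way. The limit of sixteen reflects the prototype described above; the names are hypothetical.

```python
MAX_ALGORITHMS = 16   # prototype limit quoted in the text


def apply_by_region(image, region_map, algorithms):
    """Apply, at each pixel, the algorithm selected by the region map."""
    if len(algorithms) > MAX_ALGORITHMS:
        raise ValueError("the prototype selects among at most 16 algorithms")
    return [
        [algorithms[select](pixel) for pixel, select in zip(img_row, map_row)]
        for img_row, map_row in zip(image, region_map)
    ]


def passthrough(pixel):
    return pixel


def threshold(pixel):
    return 255 if pixel >= 128 else 0


image = [[10, 200], [30, 40]]
region_map = [[0, 1], [1, 0]]   # 0 = leave unchanged, 1 = threshold
result = apply_by_region(image, region_map, [passthrough, threshold])
```

Because the map is itself an image, it can be supplied by the host or produced by an earlier operation, which is what allows regions of interest to follow the scene.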

The third mode permits PIPE to function as a
multi-resolution pyramid machine. In this mode, the images carried by
the  forward  pathway  are  reduced  in  size  by  one  half  at  each 
stage,  while  sizes  of  the  images  carried  by  the  retrograde 
pathway  are  doubled  at  each  stage.  The  images  carried  by  the 
recursive  pathway  remain  unchanged  in  resolution.  Any  combination 
of  stages  may  operate  in  this  mode,  under  program  control. 
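
The size changes in pyramid mode can be illustrated in software. The text states only that sizes are halved on the forward pathway and doubled on the retrograde pathway; the sketch assumes that each dimension is halved or doubled, and the particular methods used here (2 x 2 averaging to reduce, pixel replication to expand) are assumptions chosen for simplicity.

```python
def reduce_half(image):
    """Halve each dimension by averaging non-overlapping 2 x 2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
          + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) // 4
         for c in range(cols)]
        for r in range(rows)
    ]


def expand_double(image):
    """Double each dimension by replicating every pixel into a 2 x 2 block."""
    expanded = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        expanded.append(wide)
        expanded.append(list(wide))
    return expanded


full = [
    [0, 4, 8, 12],
    [4, 8, 12, 16],
    [16, 20, 24, 28],
    [20, 24, 28, 32],
]
coarse = reduce_half(full)        # forward pathway: 4 x 4 becomes 2 x 2
restored = expand_double(coarse)  # retrograde pathway: 2 x 2 becomes 4 x 4
```

Chaining `reduce_half` across successive stages yields the levels of a resolution pyramid, while `expand_double` brings a coarse result back into register with a finer level.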

In  addition  to  the  pathways  mentioned,  PIPE  contains  four 
"wild  card"  busses  which  may  be  used  to  transport  images  from 
higher levels of processing, or from any buffer in the machine,
into any other buffer in the machine. It also has two special
stages,  for  input  and  output,  which  communicate  with  the  sensors, 
with  processors  in  upper  levels  of  the  system,  or  with  special 
auxiliary  devices.  DMA  channels  allow  video  display  or  streaming 
input  and  output  from  any  buffer  in  the  machine. 

In any of PIPE's operating modes, the operations of every
stage are completely independent, and can be completely
reconfigured in the inter-field time by an associated
stage-sequencer control unit, which in turn may select stage
configurations  from  a  stored  sequence,  or  on  command  from  the 
higher  levels. 

Operation  of  PIPE 

The  initial  version  of  PIPE  consists  of  a  sequence  of 
identical  image  processing  stages,  sandwiched  between  a  special 
input  processor  and  a  special  output  processor.  The  input 
processor  accepts  an  image  from  any  device  that  encodes  two- 
dimensional  images.  It  serves  as  a  buffer  between  the  rest  of  the 
image  processing  stages  and  the  outside  world.  Each  successive 
processing  stage  receives  image  data  in  an  identical  format, 
operates  on  it,  and  passes  it  on  to  the  next  stage  for  further 
processing.  This  sequence  is  repeated  every  television  field- 
time.  When  an  image  emerges  at  the  far  end  of  the  sequence,  it  is 
processed  by  the  special  output  stage  and  presented  to  upper 
levels  of  the  robot  vision  system  or  to  a  host  computer. 

The  image  processing  stages  between  the  input  and  output 
stages are all identical and interchangeable, but can each
perform  different  operations  on  the  image  sequences  that  they 
encounter.  Usually,  each  image  processing  stage  receives  three 
input  images  and  transmits  three  output  images.  The  input  images 
arrive  from  the  processing  stage  immediately  behind  each  stage, 
from  the  processing  stage  immediately  ahead,  and  from  a  result  of 
the  preceding  operation  performed  by  the  image  processing  stage 
itself.  Similarly,  the  results  of  processing  a  current  image  are 
transmitted  by  each  stage  to  the  next  processing  stage  in  the 
sequence,  to  the  immediately-preceding  processing  stage,  and 
recursively  back  into  the  image  processing  stage  itself.  These 
three  outputs  are  usually  not  identical,  and  each  may  furnish 
part  or  all  of  the  inputs  to  other  stages  for  the  subsequent  step 
in  processing.  The  three  inputs  may  be  weighted  and  combined  in 
each  image  processing  stage,  in  any  fashion,  before  they  are 
processed. 

In  addition  to  these  input  and  output  paths,  the  four 
'wildcard'  paths  may  be  sources  or  destinations  for  input  and 
output.  Unlike  the  three  principal  paths  connecting  the  stages, 
these wildcard paths are common to all stages, so that only one
stage can write to a particular wildcard path at a time, but any
or all stages can accept input from them. The wildcard paths
allow images to be moved arbitrarily between stages, instead of
having to step through from stage to stage. There are no
restrictions on the number of destinations for an image output to
a wildcard path.

Although there are physically only three pathways between
stages (excluding the "wildcard" busses), every pixel
neighborhood in an image is processed and sent over these paths
in every field time. The result is that PIPE simulates a fully
parallel image-flow machine; each pixel appears to have a
real-time, private line to a homologous pixel processor in three
target stages. There are numerous reasons for requiring the three
input  and  output  paths  from  each  image  processing  stage.  It  is 
clear  that  the  forward  path  allows  a  chain  of  operations  to  be 
performed  on  sequential  images,  giving  rise  in  real  time  to  a 
transformed image stream (with a constant delay). Similarly, the
recursive  path  allows  a  pipeline  of  arbitrary  length  to  be 
simulated  by  any  stage.  It  also  facilitates  the  use  of  algorithms 
that  perform  many  iterations  before  converging  to  a  desired 
result  (e.g.,  relaxation  algorithms,  or  the  simulation  of  large 
neighborhood  operators  by  successive  applications  of  smaller 
neighborhood operators). The path to the preceding image
processing  stage  allows  operations  to  be  performed  using  temporal 
as  well  as  spatial  neighborhoods.  It  also  allows  information 
inserted  at  the  output  stage  by  the  upper  levels  of  the  system  to 
participate  in  the  processing  directly.  This,  for  example, 
allows  expectations  or  image  models  to  be  used  to  guide  the 
processing  at  all  levels,  on  a  pixel-by-pixel  basis. 

It  is  helpful  in  understanding  the  functions  of  these 
processing  pathways  to  consider  each  in  isolation  first.  If  only 
the  forward  input  path  is  operative  (i.e.,  the  combining  weights 
for the retrograde and recursion paths are set to zero), we have
a  simple  image  pipeline  processor  which  can  sequentially  apply  a 
variety  of  neighborhood  operators  to  the  series  of  images  flowing 
through  it.  It  can  perform  either  arithmetic  or  Boolean 
neighborhood  operations  and,  by  thresholding,  convert  an 
arithmetic  image  into  a  Boolean  image.  For  example,  it  might  be 
used  to  smooth  an  arithmetic  gray  scale  image,  apply  edge 
detection  operators  to  it,  threshold  the  "edginess"  value  to  form 
a  binary  edge  image  and  then  apply  Boolean  neighborhood 
operations  to  find  features  in  the  edges.  The  operation  types  and 
parametric  values  for  these  operations  would  be  set  individually 
for  each  stage  by  the  stage  control  units,  which  in  turn  would  be 
instructed  (for  example  from  the  upper  levels  of  the  system)  via 
the  input  marked  "stage-by-stage  processing  control"  in  Figure  3. 
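
The forward-only case described above can be sketched in a few lines
of Python. This is only an illustration, not PIPE's implementation:
the function names, the kernels, the threshold of 300, and the use
of replicated borders are all assumptions made for the example, and
intermediate values are not rounded to eight bits.

```python
def nop3x3(img, kernel, divisor=1):
    """3x3 arithmetic neighborhood operator; border pixels are replicated."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp = replicate border
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc // divisor
    return out

SMOOTH = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # box filter, divided by 9
EDGE_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]   # horizontal gradient

def threshold(img, t):
    """Convert an arithmetic image into a Boolean (0/1) image."""
    return [[1 if abs(p) >= t else 0 for p in row] for row in img]

def forward_pipeline(img, t=300):
    """Pass the image through three hypothetical stages in sequence."""
    stages = [
        lambda im: nop3x3(im, SMOOTH, 9),   # stage 1: smooth
        lambda im: nop3x3(im, EDGE_X),      # stage 2: edge detection
        lambda im: threshold(im, t),        # stage 3: threshold "edginess"
    ]
    for stage in stages:
        img = stage(img)
    return img

# A vertical step edge between columns 3 and 4.
step = [[0, 0, 0, 0, 200, 200, 200, 200] for _ in range(5)]
edges = forward_pipeline(step)
```

In the machine itself each stage works on a different field at the
same moment, so a new result emerges every field time; the loop here
only reproduces the order of operations applied to one image.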


A  second  single-path  case  results  if  both  the  forward  and 
retrograde  paths'  combining  functions  are  zero.  Assume  that 
images  had  previously  been  loaded  into  all  the  processing  stages. 
The  recursive  path  would  then  cause  the  image  field  in  each  stage 
to  pass  through  the  forward  or  backward  transformation  operation 
recursively,  while  the  images  "marched  in  place".  A  variety  of 
relaxation  operations  can  be  implemented  in  this  way. 

For  the  final  single  path  case,  consider  that  the  weights 
assigned  to  the  forward  and  recursive  paths  are  zero,  leaving 
only  the  retrograde  pathway  active.  When  the  set  of  such  paths  is 
considered  in  isolation,  it  becomes  clear  that  it  forms  a 
processing  chain  that  is  a  retrograde  counterpart  of  the  forward 
pipeline. It would, in fact, be possible to select appropriate
retrograde  transformations,  insert  fields  of  data  at  the  back  of 
the  device,  process  them  through  to  the  front,  and  get  the  same 
result  as  running  the  system  in  the  normal  direction.  The  purpose 
of  this  is  not  to  provide  a  bidirectional  image  processor,  but  to 
permit  input  (at  the  "output"  end  of  the  device)  of  synthesized 
images.  Such  images  influence  the  processing  of  the  normally 
flowing  images  by  direct  interaction,  and  correspond  to 
"expectancies",  "models",  "hypotheses",  or  "attention  functions." 

The  retrograde  images  are  not  only  able  to  affect  processing 
of  the  forward  images,  but  are  affected  themselves  by  interaction 
with  them.  (The  effects  that  the  two  image  sequences  exert  on 
each  other  may  be  different  because  the  neighborhood  operators  on 
the forward and backward paths are independent.) Retrograde
images  will  usually  be  generated  by  knowledge-based  processes  in 
higher  level  components  of  the  robot  vision  system.  They  may 
initially  appear  in  Boolean  form,  but,  as  shown  in  Figure  6, 
provision  is  made  for  all  four  possible  combinations  of 
arithmetic  and  Boolean  inputs  and  outputs  in  the  combining  logic 
between  stages.  This  permits  a  descending  Boolean  image  to  be 
instantiated  into  arithmetic  image  values  by  interaction  with  the 
ascending  arithmetic  image.  This  occurs  in  the  same  stage  in 
which  the  ascending  arithmetic  image  representation  is 
thresholded  to  become  a  Boolean  image  (Stage  "N"  of  figure  6.) 
Both  the  ascending  data  image  and  the  descending  "hypothesis" 
image  can  pass  across  this  interface.  A  major  function  of  PIPE 
will  be  to  explore  the  effectiveness  of  various  approaches  to 
hypothesis-guided  iconic  image  processing. 

There  are  a  great  many  ways  in  which  various  combinations  of 
these  pathways  can  be  used  in  processing  by  operating  on 
combinations  of  processed  images  arriving  over  the  three 
pathways.  I  will  consider  only  a  few  illustrative  examples. 
Modes of interaction between arithmetic and Boolean


If the preceding and succeeding image fields are considered
to contain future and past instances of a field, respectively (as
is true in a dynamic image), then the forward path corresponds to a
path from the future, recursion to a path from the present, and
the retrograde path to one from the past. The weighted sum of
the three paths' contents then forms a convolution operation on
the  temporal  neighborhood  of  a  pixel.  This  may  occur  at  the  same 
time  as  a  spatial  neighborhood  convolution  operation  is  being 
performed  on  the  contemporary  spatial  neighborhood  in  each  stage, 
so  that  a  combined  spatio-temporal  convolution  is  achieved. 
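
As a rough illustration of the temporal part of this operation, the
sketch below forms the weighted sum of homologous pixels arriving on
the three paths. The function name, the (1, 2, 1) default weights,
and the rounding rule are assumptions chosen for the example; the
weights are assumed to have a positive sum.

```python
def temporal_combine(forward, recursive, retrograde, weights=(1, 2, 1)):
    """Weighted sum over the temporal neighborhood of each pixel.

    forward    : field from the preceding stage (the "future")
    recursive  : field from this stage          (the "present")
    retrograde : field from the next stage      (the "past")
    """
    wf, wr, wb = weights
    total = wf + wr + wb  # normalizing divisor, assumed positive
    return [
        [(wf * f + wr * r + wb * b + total // 2) // total
         for f, r, b in zip(frow, rrow, brow)]
        for frow, rrow, brow in zip(forward, recursive, retrograde)
    ]

smoothed = temporal_combine([[10]], [[20]], [[30]])
```

Setting two of the three weights to zero recovers the single-path
cases discussed earlier.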

Boolean  information  can  be  processed  in  an  interesting  way 
by  combining  the  outputs  of  the  forward  operator  from  the 
previous  stage  and  the  recursive  input  from  the  current  stage. 
Consider  the  case  of  a  single  stage  treated  in  this  fashion  for 
eight  field-times,  using  "SHIFT  recursive  then  OR  recursive  with 
forward"  as  the  combining  operation.  Let  the  incoming  images  from 
the  previous  stage  (forward  path)  have  Boolean  values  resulting 
from  thresholding  of  eight  successive  and  independent  feature 
detection  operations  such  as  oriented  edge  detection,  texture 
measures,  etc.  The  second  stage  will  accumulate  images  from  the 
eight  preceding  Boolean  operations  into  an  image  composed  of 
eight-bit  Boolean  vectors,  each  bit  representing  the  presence  or 
absence of an independent image property at that location.
Subsequent Boolean neighborhood operations may then apply
independent operators to each bit plane of a neighborhood of such
vectors.
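
The "SHIFT recursive then OR recursive with forward" operation can be
sketched as follows. The function name and the choice of a left shift
(so that the first feature ends up in the high bit) are assumptions
for illustration; the text does not specify the shift direction.

```python
def accumulate_features(boolean_fields):
    """Pack successive one-bit feature images into eight-bit vectors.

    boolean_fields: a sequence of images whose pixels are 0 or 1, one
    image per field-time. The accumulator plays the role of the
    recursive path; each incoming field arrives on the forward path.
    """
    h, w = len(boolean_fields[0]), len(boolean_fields[0][0])
    acc = [[0] * w for _ in range(h)]
    for field in boolean_fields:
        acc = [
            [((a << 1) & 0xFF) | (b & 1)      # SHIFT recursive, OR forward
             for a, b in zip(arow, frow)]
            for arow, frow in zip(acc, field)
        ]
    return acc

# Two pixels, eight field-times. Pixel 0 has features 1 and 8 present,
# pixel 1 has features 2 and 3 present.
fields = [[[1, 0]], [[0, 1]], [[0, 1]], [[0, 0]],
          [[0, 0]], [[0, 0]], [[0, 0]], [[1, 0]]]
vectors = accumulate_features(fields)
```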

HARDWARE  DETAILS  OF  THE  PIPE  STAGES. 

Input  Stage 

A  special  input  stage  is  used  to  capture  and  buffer  images 
from  input  devices.  This  allows  PIPE  to  accept  digital  or  analog 
signals  from  any  device  using  standard  RS-170  television  signals 
and  timing.  Either  of  two  selectable  analog  signals  is  digitized 
by an eight-bit real-time digitizer. The input stage is capable
of  acquiring  a  digitized  image  of  256  x  240  pixels  while 
remaining  synchronized  with  RS-170  signals.  Alternatively,  it  can 
capture 256 x 256 pixel images from non-RS-170 signals while
internally employing non-standard pixel rates. It can continually
capture such images at standard television field rates, and place
them in either of the two field buffers contained in the input
stage. While storing an image into one of these buffers, the input
stage can also simultaneously store an image into either buffer,
such as a difference image, formed by an ALU operation between
the incoming image and a previously captured image. The contents
of either of the buffers in the input stage can be sent to the
first of the processing stages, while the next image is being
acquired.

PIPE accepts eight-bit input data, and this precision is
maintained throughout the machine. Intermediate arithmetic
operations within subsequent stages are carried to sufficient
precision to ensure no loss of accuracy when the result is
rounded to eight bits for transmission to subsequent stages. The
data may be treated as either unsigned eight-bit numbers, or as
two's complement signed numbers with the high bit indicating the
sign, or as Boolean vectors of eight independent bits. All stages
may  independently  select  the  representation  employed,  so  that 
unsigned  input  data  may  be  processed  until  an  operation  which 
generates  negative  values  occurs,  treated  as  signed  data 
thereafter  until  a  thresholding  operation  occurs,  and  then 
treated  as  Boolean  data. 
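
The three interpretations of the same eight bits can be made
concrete with a short sketch. The helper names are hypothetical and
exist only for this example.

```python
def as_unsigned(byte):
    """Interpret eight bits as an unsigned number in 0..255."""
    return byte & 0xFF

def as_signed(byte):
    """Interpret eight bits as two's complement, -128..127."""
    byte &= 0xFF
    return byte - 256 if byte & 0x80 else byte

def as_boolean_vector(byte):
    """Interpret eight bits as eight independent flags, high bit first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]

pixel = 0xF0
views = (as_unsigned(pixel), as_signed(pixel), as_boolean_vector(pixel))
```

The stored pattern never changes; only the stage's choice of
interpretation does.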

Processing  Stages 

Following the input stage, there is a series of modular
processing stages (MPS). The MPSs are the "stages" referred to in
the preceding sections, and are the elements which perform most
of PIPE's processing. All MPSs are of identical modular
construction, and are physically interchangeable simply by
switching circuit boards. Thus, any MPS can operate at any
position in the processing chain, and the processing chain can
have a variable length. Eight MPSs are employed for the present
development phase of PIPE. The block diagram of an MPS is
portrayed in figure 7.

The pre-storage input section of the Nth MPS accepts three
eight-bit 256 x 256 pixel images as input. These come from the
forward output of the (N-1)st MPS, from the recursive output of the
operation performed on the previous contents of the Nth MPS, and
from the retrograde output of the (N+1)st MPS. Each data stream may
consist, independently of the other two, of arithmetic or Boolean
(eight-bit Boolean vector) data, but a given data stream entering
an MPS must be entirely Boolean or arithmetic within any single
image field.

Before generating a final eight-bit image from the three
data streams, the input section of each MPS performs an arbitrary
table-lookup transformation on each of them independently and
simultaneously (forward, backward, and recursive L.U.T. in figure
7.) Each of the three lookup tables is selected from one of three
sets of 32 tables by the stage operation micro-instruction. The
resulting three Boolean and/or arithmetic data streams are then
combined through independently-programmable full-function ALUs
into a single arithmetic or Boolean data stream (figure 7, ALU-A
and ALU-B.) This data stream is then used to load either of the
two  selectable  field  buffers  within  the  MPS  (buffer  A  and  buffer 
B  in  the  figure.)  Alternatively,  either  or  both  buffers  can  be 
filled  using  the  wildcard  busses  or  by  direct  DMA  from  the  host. 
The  contents  of  both  of  these  field  buffers  are  then  available  to 
subsequent  operations  of  the  MPS.  External  device  access  to  the 
data  in  these  buffers  is  also  available;  an  external  device  may 
read  from  or  write  into  either  buffer  in  a  random  access  manner 
at 400,000 pixels/sec., with auto-indexed addressing supported on
command.  The  wildcard  busses  provide  streaming  access  to  external 
devices  (including  monitors)  at  pixel  rates. 
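
The data flow through the input section can be sketched as below.
The wiring assumed here (ALU-A combines the forward and recursive
streams, and ALU-B combines that result with the retrograde stream)
is a guess made for illustration, since the text names the two ALUs
but does not spell out how they are connected. All function and
parameter names are hypothetical.

```python
def input_section(forward, recursive, retrograde,
                  lut_f, lut_r, lut_b, alu_a, alu_b):
    """Look up each stream, then merge the three into one 8-bit image."""
    out = []
    for frow, rrow, brow in zip(forward, recursive, retrograde):
        row = []
        for f, r, b in zip(frow, rrow, brow):
            partial = alu_a(lut_f[f], lut_r[r])            # assumed ALU-A role
            row.append(alu_b(partial, lut_b[b]) & 0xFF)    # assumed ALU-B role
        out.append(row)
    return out

identity = list(range(256))
doubled = [(2 * i) & 0xFF for i in range(256)]

merged = input_section(
    [[1, 2]], [[10, 20]], [[5, 50]],
    lut_f=doubled, lut_r=identity, lut_b=identity,
    alu_a=lambda a, b: a + b,   # arithmetic add
    alu_b=max,                  # pixelwise maximum
)
```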

The  hardware  that  implements  the  output  functions  of  the 
MPS,  subsequent  to  the  field  buffer  storage  step,  is  physically 
contained  on  a  separate  circuit  card  to  allow  it  to  be  replaced 
with  other  special  functional  modules,  should  this  be  desirable. 
This  circuitry  is  represented  by  the  area  to  the  right  of  the 
frame  buffers  in  Figure  7.  For  neighborhood  operations,  an  eight- 
bit  image  is  selected  by  reading  the  contents  of  one  of  the  two 
field buffers in the MPS. The image is transformed by a
single-valued mapping through one of 32 program-selectable lookup
tables (pre-nop L.U.T.), and the pixels of the resulting image are
passed to two neighborhood operators (NOP A and NOP B), of which
there are two kinds. The first type of neighborhood operator is
an arithmetic convolution operation, while the second is a
Boolean operation. For either operator, the neighborhood of
operation is (at present) 3 x 3 pixels square, and the operation
is accomplished in 200 nsec. Pixel neighborhoods are generated by
passing the data stream through a 3-line buffer.

In  the  arithmetic  case,  the  convolution  operation  uses 
arbitrary  positive  or  negative  eight-bit  neighborhood  weights, 
and  maintains  twelve-bit  accuracy  in  its  intermediate  results. 
The final eight-bit arithmetic result is produced by
non-biased rounding from a 20-bit sum. This ensures that no loss of
precision occurs within a stage due to arithmetic underflow or
overflow. The full eight-bit precision of the input is thus
maintained between stages throughout the machine. In the Boolean
case, the neighborhood operation consists of arbitrary Boolean
operations (a sum-of-products AND-OR array equivalent) between
the  set  of  all  the  pixels  of  the  data  neighborhood,  and  the  set 
of  all  corresponding  pixels  of  an  arbitrarily  specified 
comparison  neighborhood.  Any  bit  of  either  neighborhood  may  be 
independently  defined  as  true,  false  (complemented),  or  "don't 
care".  Each  of  the  eight  bit-planes  forms  an  independent  set  of 
inputs, subject to independent neighborhood operations. As a
result, eight independent one-bit results are obtained from a
single  pass  of  the  data  through  the  pipeline,  yielding  an 
orthogonal  eight-bit  Boolean  vector  as  output. 
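
One way to picture the Boolean neighborhood operator is as template
matching with "don't care" entries, applied separately to each bit
plane. The sketch below is a software analogy only: the data layout
(a dictionary from bit number to a list of 3 x 3 templates, with None
standing for "don't care") and the treatment of pixels outside the
image as zero are assumptions made for the example.

```python
def matches(img, y, x, bit, template):
    """True if one product term (template) matches at pixel (y, x)."""
    h, w = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            want = template[dy + 1][dx + 1]
            if want is None:          # "don't care" position
                continue
            yy, xx = y + dy, x + dx
            pix = img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            if ((pix >> bit) & 1) != want:
                return False
    return True

def boolean_nop(img, terms):
    """terms maps a bit number to a list of templates; the output bit
    is the OR of all matching templates (a sum of products)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for bit, templates in terms.items():
                if any(matches(img, y, x, bit, t) for t in templates):
                    out[y][x] |= 1 << bit
    return out

N = None
ISOLATED = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # a lone set pixel
H_LINE = [[N, N, N], [1, 1, 1], [N, N, N]]     # three in a row

image = [[0, 0, 0], [2, 3, 2], [0, 0, 0]]
result = boolean_nop(image, {0: [ISOLATED], 1: [H_LINE]})
```

Bit plane 0 and bit plane 1 are tested against different templates
in the same pass, which is the sense in which the eight output bits
are independent.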

Both  neighborhood  operators  are  applied  independently  and 
simultaneously.  They  operate  on  the  same  data  stream,  using 
neighborhood  operations  which  may  be  different.  Their  outputs,  or 
the  contents  of  either  of  the  field  buffers,  may  be  independently 
subjected  to  a  second  transformation  by  either  of  two 
programmable  functions.  The  first  of  these  is  a  lookup-table 
mapping  function  (TVF  L.U.T.  in  figure  7.)  This  transformation 
may  be  a  function  of  one  or  two  eight-bit  arguments.  If  two 
arguments  are  used,  they  may  be  taken  from  homologous  pixels  of 
either  of  the  NOP  outputs  or  either  of  the  field  buffers.  The 
lookup-table  is  program-selectable  from  a  number  of  stored  tables 
which  depends  on  the  number  of  arguments  to  the  input.  The  other 
function,  with  inputs  selectable  from  the  same  sources,  is  an  ALU 
with  two  eight-bit  inputs  (ALU  C.) 

Finally,  the  contents  of  either  of  the  buffers,  the  results 
of  either  of  the  neighborhood  operators,  and  the  results  of 
either  of  the  two  functions  of  two  arguments,  may  be  sent  to  the 
three  output  pathways,  or  to  the  four  wildcard  busses,  in  any 
combination,  by  a  crossbar  switching  network  shown  at  the  extreme 
right  in  figure  7. 

These  basic  features  of  the  processing  stages  may  be  altered 
in  operation  by  one  of  two  special  processing  modes.  The  MIMD,  or 
"region  of  interest"  mode  allows  each  MPS  to  switch  between 
alternative  operation  sets  on  a  pixel-by-pixel  basis.  In  this 
mode,  one  image  buffer  of  the  stage  contains  a  map  of  the 
operations  to  be  performed  on  homologous  pixels  of  the  image 
buffer  undergoing  operations.  In  this  mode,  the  contents  of  the 
operation-controlling buffer are treated as offsets into the
on-board micro-operation store of the stage, and select the
operative micro-operation code for each pixel-time. Each
micro-operation word controls the routing of data-flow within the
stage,  the  ALU  functions,  the  identity  of  the  look-up  tables 
employed,  and  the  selection  of  output  paths.  Potentially,  up  to 
256  different  alternative  operation  sets  could  be  specified  by 
the  eight-bit  contents  of  each  pixel  in  the  map.  In  practice,  the 
number  of  alternative  operation  sets  selectable  during  a  field 
processing  time  will  be  limited  by  the  amount  of  memory  available 
within  the  stage  to  store  them,  which  may  be  enlarged  at  will. 
The  operation  sets  stored  in  the  available  memory  may  be  changed 
arbitrarily  between  fields. 
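
The region-of-interest idea can be sketched with a map image whose
pixel values index a small table of operations. For brevity the
operations here are simple point functions; in PIPE each entry would
be a full micro-operation word controlling routing, tables, and ALU
functions. All names below are hypothetical.

```python
def mimd_stage(data, op_map, op_table):
    """Apply to each data pixel the operation chosen by the map pixel."""
    return [
        [op_table[m](p) & 0xFF for p, m in zip(drow, mrow)]
        for drow, mrow in zip(data, op_map)
    ]

op_table = {
    0: lambda p: p,                          # pass through
    1: lambda p: 255 - p,                    # invert
    2: lambda p: 255 if p >= 128 else 0,     # threshold
}

data = [[10, 200], [10, 200]]
op_map = [[0, 1], [2, 2]]   # top row: pass / invert; bottom row: threshold
processed = mimd_stage(data, op_map, op_table)
```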

In  "Pyramid  Mode",  PIPE  allows  the  construction  of 
multiresolution, "pyramid", sequences of images. The basic
operations available in PIPE for constructing image pyramids are
sampling  and  pixel  doubling.  Sampling  is  used  to  reduce  the 
resolution  of  an  image,  while  doubling  is  used  to  increase  the 
size  of  an  image. 

Both  the  sampling  and  doubling  operations  are  performed  by 
manipulating  addressing  rates  within  a  stage.  Sampling  is 
achieved  via  the  forward  pathway  by  incrementing  the  destination 
image  addresses  half  as  fast  as  the  source  addresses.  That  is,  on 
each  row,  the  first  pixel  in  the  source  image  is  written  to  the 
first  pixel  in  the  destination  image.  The  second  source  pixel  is 
also  written  to  the  first  destination  pixel.  The  address  of  the 
destination  pixel  is  then  incremented,  and  the  procedure  is 
repeated.  The  same  process  is  used  to  sample  into  every  other 
row in the destination image. The result is that the destination
image has half the linear resolution of the source image, and so
one quarter as many pixels. Doubling
is  accomplished  via  the  retrograde  pathway  by  the  inverse  of  the 
sampling  process.  That  is,  the  addresses  in  the  source  image  are 
now  being  incremented  at  half  the  rate  of  those  in  the 
destination  image.  For  each  row  in  the  source  (reduced- 
resolution)  image,  two  identical  rows  are  output  in  the 
destination  image.  For  each  pixel  in  each  row  of  the  source 
image,  two  identical  pixels  are  stored  in  the  destination  image. 
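
The address manipulation can be imitated in software by advancing
the destination index at half (or twice) the source rate. In this
sketch, which is an interpretation rather than a description of the
hardware, later writes to the same destination pixel simply overwrite
earlier ones, so each 2 x 2 source block leaves its last-written
pixel in the sampled image.

```python
def sample(src):
    """Halve the resolution: destination addresses advance half as fast."""
    h, w = len(src), len(src[0])
    dst = [[0] * ((w + 1) // 2) for _ in range((h + 1) // 2)]
    for y in range(h):
        for x in range(w):
            dst[y // 2][x // 2] = src[y][x]   # later writes overwrite
    return dst

def double(src):
    """Double the size: source addresses advance half as fast."""
    h, w = len(src), len(src[0])
    return [[src[y // 2][x // 2] for x in range(2 * w)]
            for y in range(2 * h)]

full = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
half = sample(full)
restored = double(half)
```

Doubling followed by sampling returns the original reduced image,
while sampling followed by doubling does not recover the detail that
was discarded.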

The  simple  operations  of  image  sampling  and  pixel  doubling 
are  not  of  themselves  very  useful  except  for  a  narrow  range  of 
applications.  However,  they  may  be  combined  with  the  other 
operations in the MPS, so that a much broader class of operations
becomes  possible.  Prior  to  sending  the  images  over  the  forward 
and  backward  paths,  they  may  pass  through  the  operations  of  the 
output  section  of  the  MPS,  and  prior  to  storage,  they  may  pass 
through  the  operations  of  the  input  section  of  the  destination 
MPS.  For  example,  the  neighborhood  operator  can  be  used  to  smooth 
the  image  before  sampling.  By  iterating  the  neighborhood 
operation  prior  to  sampling,  the  effects  of  neighborhoods  larger 
than  three  by  three  can  be  obtained,  allowing,  for  example,  the 
construction of "Gaussian" pyramids using the hierarchical
discrete correlation procedure of Burt. By passing an image
through one neighborhood operator into the forward path, through
the other into the recursive path, and by passing the retrograde
path directly to the look-up table of the input section of the
(N-1)st MPS, the pyramidal neighborhood operations of Tanimoto's
Hierarchical Cellular Logic4 may be realized.

Edge  effects  that  arise  when  a  neighborhood  operator  is 
applied  are  dealt  with  in  the  same  manner  for  all  resolutions  of 
images, and for borders of MIMD operation regions as well as for
frame  borders.  PIPE  automatically  provides  the  replication  or 
zeroing of border pixels. If a neighborhood has a row or a column
that lies outside the boundaries of the image (either beyond the
image buffer itself or beyond the extent of a low-resolution
image, or beyond the bound of a MIMD operation type), the
non-existent pixels are zeroed or replaced by the pixels in the
border  row  or  column.  For  a  three  by  three  neighborhood,  this  is 
equivalent  both  to  reflecting  the  image  and  to  repeating  the 
border  pixels.  This  is  achieved  in  the  same  way  as  the  varying 
resolution  images  are  constructed,  i.e.,  by  manipulating  the 
address  lines  of  the  buffer. 
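
The equivalence claimed for a three by three neighborhood can be
checked with a small sketch that extends an image by one pixel on
every side. The function name and the mode strings are hypothetical;
"reflect" here means mirroring about the image edge, so that the
first pixel outside the image equals the border pixel.

```python
def pad1(img, mode):
    """Return img extended by one pixel on every side."""
    h, w = len(img), len(img[0])

    def mirror(i, n):
        # Mirror an index about the image edge: -1 -> 0, n -> n - 1.
        if i < 0:
            return -i - 1
        if i >= n:
            return 2 * n - 1 - i
        return i

    def fetch(y, x):
        if 0 <= y < h and 0 <= x < w:
            return img[y][x]
        if mode == "zero":
            return 0
        if mode == "replicate":
            return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
        if mode == "reflect":
            return img[mirror(y, h)][mirror(x, w)]
        raise ValueError(mode)

    return [[fetch(y, x) for x in range(-1, w + 1)]
            for y in range(-1, h + 1)]

img = [[1, 2], [3, 4]]
replicated = pad1(img, "replicate")
reflected = pad1(img, "reflect")
```

For a one-pixel border the two results are identical; they would
differ only for neighborhoods wider than three pixels.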

Output  Stage 

The  output  stage  performs  a  role  at  the  end  of  the 
processing  chain  similar  to  that  of  the  input  stage  at  its 
beginning.  The  final  MPS  delivers  its  forward  image  output  to 
either  one  of  a  pair  of  field  buffers  in  the  output  stage,  and 
can  simultaneously  read  from  the  other  buffer  of  the  output 
stage.  The  data  read  from  the  output  stage  is  used  as  the  input 
to  the  retrograde  path  of  the  final  MPS.  Without  interrupting  the 
image-processing,  either  buffer  of  the  output  stage  can  be  read 
from  or  written  into  by  an  external  device,  which  is  both  the 
consumer  of  the  processed  forward  data-flow  and  the  supplier  of 
data  for  the  retrograde  path. 

Sequencer  and  Stage  Control 

Prior to run time, the pipeline modules must be set up with
the  individual  operations  to  be  carried  out.  The  upper  levels  of 
the  system  must  load  each  stage  with  appropriate  instructions  for 
all  the  processing  steps  to  be  employed.  Each  stage  has  storage 
for  up  to  256  128-bit  micro-operation  words,  each  of  which  can 
specify  the  entire  set  of  operations  for  the  stage.  This  store  is 
loaded  by  the  host  or  upper  levels  of  the  vision  system.  Each 
stage  also  has  control  circuitry  which  allows  a  micro-operation 
word  to  be  selected  from  the  stored  set  by  external  command.  The 
selection command can come either from the host or from PIPE's
Sequencer.  In  either  case  the  selection  can  completely 
reconfigure  the  stage  operation  during  the  inter-field  time  by 
selecting  a  new  stage  operation  set  for  the  next  field-time,  or  a 
group  of  stage  operation  sets  from  which  members  are  to  be 
selected  on  a  pixel-by-pixel  basis  in  MIMD  mode  during  the  next 
field-time. 

Normally,  the  cycle-by-cycle  selection  of  the  operations 
stored in each stage will be performed by the Sequencer module. When
active,  this  unit  issues  an  operation  selection  command  to  every 
stage  of  PIPE  at  the  beginning  of  each  field-time.  The  nature  and 
order  of  these  operation  selection  commands  are  determined  by  a 
program in the sequencer module which specifies the order of
commands to the stages, including loops and branches, so that the
coordination of stage operations to effect various PIPE
algorithms  is  accomplished.  This  sequencer  program  is  loaded  from 
the  host  prior  to  run  time.  At  any  time,  the  host  can  override 
the  sequencer  program  and  intervene  in  the  stage  operation 
selection  process  directly.  In  operation,  the  upper  levels  of  the 
system  may  instruct  the  sequencer  to  select  a  stage  program, 
instruct  it  to  branch  in  the  specified  sequence  of  operations,  or 
permit  it  to  follow  the  pre-set  sequence  of  operations  (which  may 
contain  branch  points  on  repetition  counts.) 
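
A sequencer program of this kind can be pictured as a short list of
instructions, some of which issue a selection to every stage and
some of which loop on a repetition count. The instruction format
below is invented purely for illustration and does not reflect the
actual sequencer encoding.

```python
def run_sequencer(program, max_fields):
    """Return the per-field selections issued by a toy sequencer.

    ("select", ops)          issue ops (one index per stage) for a field
    ("loop", target, count)  run the body from `target` `count` times
    """
    pc, counters, issued = 0, {}, []
    while pc < len(program) and len(issued) < max_fields:
        instr = program[pc]
        if instr[0] == "select":
            issued.append(instr[1])       # one command set per field-time
            pc += 1
        elif instr[0] == "loop":
            _, target, count = instr
            remaining = counters.get(pc, count)
            if remaining > 1:
                counters[pc] = remaining - 1
                pc = target               # branch back on repetition count
            else:
                counters.pop(pc, None)    # loop finished, fall through
                pc += 1
        else:
            raise ValueError(instr[0])
    return issued

program = [
    ("select", (0, 0)),   # load phase
    ("select", (1, 1)),   # iterate a relaxation step ...
    ("loop", 1, 3),       # ... three times in all
    ("select", (2, 2)),   # output phase
]
schedule = run_sequencer(program, max_fields=10)
```

Host intervention would correspond to replacing or bypassing the
next entry of the schedule; that is not modeled here.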

Programming  and  running  PIPE  thus  consists  of  specifying  the 
operations to be performed by each stage, loading the
corresponding  operators,  parameters,  and  tables  into  the  stages, 
and  then  loading  the  sequence  of  operations  for  the  stages  into 
the  sequencer  module.  For  program  development,  the  contents  of 
any  buffer  and  the  output  of  any  operator  in  the  system  can  be 
displayed  on  a  video  monitor,  while  the  sequencer  is  single- 
stepped. 

INPUT/OUTPUT  MODES 

In  keeping  with  its  role  as  a  real-time  vision  processor, 
PIPE  has  several  rapid  means  for  transferring  information  to  and 
from  the  host  device.  The  wildcard  busses  may  be  output  directly, 
either  as  digital  streams  or  after  conversion  to  analog  signals. 
The  analog  outputs  provide  for  interface  to  monitors  or  other 
display  devices,  while  the  digital  outputs  provide  streaming  I/O 
at  video  rates  to  any  specialized  device  which  can  accept  this 
rate  of  input.  This  streaming  I/O  mode  is  represented  in  figure 
8-a.  Figure  8-b  presents  a  parallel  port  I/O  scheme  which  allows 
the  host  device  access  to  any  of  the  PIPE  image  buffers  in  a 
random  access  fashion.  These  parallel  port  transfers  can  also 
function in auto-increment and DMA modes. The most sophisticated
method  of  transferring  image  information  to  the  host  uses  the 
ISMAP  device  to  transfer  symbolic  descriptions  of  the  image  to 
the  host's  address  space.  This  transfer  may  be  bi-directional, 
and  can  take  any  PIPE  buffer  as  a  source  or  destination.  This  I/O 
technique  is  illustrated  in  Figure  8-c. 

ISMAP 

The  associated  device,  ISMAP,  will  map  processed  iconic 
images  into  symbolic  (property-indexed)  structures.  ISMAP 
describes  the  processed  images  produced  by  PIPE  by  mapping  the 
iconic  image  of  Boolean  vectors  produced  by  PIPE  into  ordered 
list  structures  in  the  address  space  of  processors  in  the  upper 
levels  of  the  system,  in  real  time.  A  major  function  of  ISMAP  is 
to map PIPE's image addresses into a space ordered by symbolic
picture feature descriptors. This saves processors in the upper
levels  of  the  system  the  necessity  of  scanning  the  processed 
image  to  find  locations  of  items  of  interest.  Physically,  ISMAP 
resides  in  the  PIPE  backplane  as  an  integrated  part  of  the 
system. 

In  each  frame  time,  ISMAP  scans  the  entire  image  of 
iconically-ordered feature codes produced by PIPE, and prepares
three  organized  descriptions  of  the  feature  set  of  the  image  in  a 
symbolically-ordered  space  in  the  address  space  of  the  host.  The 
three  descriptions  of  the  image's  feature  content  produced  by 
ISMAP  are:  1)  a  histogram  of  feature  types,  organized  by  feature 
type,  2)  a  cumulative  histogram  of  feature  types  organized  by 
feature  type,  and  3)  lists,  organized  by  feature  type,  of  the 
image-coordinates  of  every  feature  in  the  image.  The  values  in 
the  cumulative  histogram  serve  as  pointers  to  the  heads  of  the 
lists  of  image  locations.  For  example,  if  the  host  needs  the 
locations  of  all  edges  with  a  certain  angle  of  inclination,  it 
can  find  the  beginning  and  the  length  of  a  compact  list  of  those 
locations,  in  its  own  address  space,  simply  by  looking  at  the 
cumulative  histogram  entry  for  the  desired  edge  direction.  It  may 
first  use  the  histogram  to  determine  what  edge  directions  are 
prominent  in  the  image,  or  perform  any  of  a  variety  of  other 
algorithms  on  this  information  without  the  burden  of  examining 
the  image-space  in  order  to  count  or  locate  features  of  interest. 

ISMAP  may  also  run  in  reverse,  creating  images  in  host 
buffers  from  feature  lists  created  or  selected  by  the  host.  This 
permits  ISMAP  to  be  the  first  level  of  the  upper  system  which  can 
generate  hypothesis  images  for  PIPE,  which  allows  the  PIPE/ISMAP 
combination  to  perform  operations  such  as  Hough  transforms 
easily. 

CONMAP 

ISMAP  is  one  of  two  PIPE  auxiliary  devices.  Another, 
CONMAP,  which  is  currently  under  construction,  stands  at  the 
front  of  PIPE  as  a  geometric  pre-processor  for  input  images.  The 
CONMAP  device  accepts  iconic  images  and  remaps  them  into 
topologically  identical,  but  geometrically  different  iconic 
images  prior  to  their  acquisition  by  PIPE.  In  its  final  form, 
CONMAP  will  implement  fully  general  conformal  mappings  on  images. 
There  are  a  variety  of  uses  for  such  transforms.  Log-polar 
mappings  have  been  suggested  by  Weiman  and  Chaikin3,  as  a  means 
of  converting  complex  image  properties,  such  as  rotation  and 
scale,  into  simple  translations.  Jain0  has  employed  a  similar 
transformation  for  utilizing  the  effects  of  camera  motion  in 
image  understanding.  Other  applications  include  simulation  of 
lens geometries which permit non-uniform resolution (e.g., a
high-resolution "fovea" with lower-resolution periphery). CONMAP will
transform  images  at  field  rate,  and  its  operation  will  be 
transparent  to  PIPE  except  for  a  one-field  delay. 
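
The effect of a log-polar mapping on rotation and scale can be checked
with a few lines of code. This is a point-wise sketch of the
mathematical mapping only, not a model of the CONMAP hardware; the
function name and the choice of the mapping center are assumptions.

```python
import math


def log_polar(x, y, cx=0.0, cy=0.0):
    """Map the image point (x, y) to (log r, theta) about the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("the center itself has no log-polar image")
    return math.log(r), math.atan2(dy, dx)


# Scaling a point by 2 about the center shifts the first coordinate by log 2.
u1, v1 = log_polar(3.0, 4.0)
u2, v2 = log_polar(6.0, 8.0)

# Rotating the same point by 90 degrees shifts the second coordinate by pi/2.
u3, v3 = log_polar(-4.0, 3.0)
```

In the mapped space, a change of scale appears as a translation along
the first axis and a rotation as a translation along the second (modulo
a full turn).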

PIPE'S SYSTEM SOFTWARE.

PIPE  is  provided  with  a  software  toolkit  which  includes  a 
number  of  highly-developed  programming  systems.  A  simulator 
exists,  although  it  is  not  used  when  a  PIPE  machine  is  available, 
since  the  other  software  development  tools  permit  interactive 
test  and  debugging  of  algorithms  in  real-time  on  the  machine.  All 
PIPE  system  software  is  graphics-based  and  runs  in  color-coded 
formats  on  the  screens  of  the  host. 

At  the  bottom  level,  a  software  "keyboard"  permits 
interactively  displaying  and  setting  any  bit,  byte,  or  word  in 
the  machine's  microcoded  operation  stores.  This  can  be  done  while 
PIPE  is  running  algorithms,  and  the  effects  observed.  This 
program  can  also  save  code  to,  or  load  code  from  disk  files. 

Microcoded  instruction  lists  for  the  operation  store  can  be 
produced  and  stored  in  disk  files  by  a  graphics-based  assembler. 
This  program  presents  diagrammatic  icons  portraying  the  logical 
functions  of  the  processing  stages  on  the  screen,  and 
interactively  fills  in  data-flow  paths  and  operations  as  the  user 
specifies  them.  The  program  has  an  expert-system  knowledge  of 
PIPE.  It  will  only  inquire  about  options  as  they  become  necessary 
based  on  user  choices,  and  it  will  reject  contradictory  or 
illegal  choices. 

Figure  9.  demonstrates  a  PIPE  programming  diagram.  Spatial 
progression  of  stages  is  from  left  to  right.  Temporal  progression 
of  machine  cycles  is  from  top  to  bottom.  A  single  row  indicates 
the  state  of  the  machine  in  a  single  field  time;  a  single  column 
indicates  the  successive  states  of  a  single  stage.  Diagrams  such 
as  this  illustrate  the  space-time  systolic  nature  of  PIPE 
programs,  and  can  be  transferred  directly  to  the  graphics  screens 
of  the  system  software  tools  for  program  generation. 

Another  graphics-oriented  program,  similar  to  an  assembler, 
generates  programs  for  the  global  sequencer  which,  in  turn, 
controls  selection  of  words  in  the  micro-operation  stores  of  all 
PIPE's stages on a cycle-by-cycle basis. This program generates
disk files which can be loaded into PIPE's system controller.

PIPE employs many tables of great complexity. A table-generating
program is available to simplify their production. For example,
extremely complex tables are required for the bit-slice arithmetic
in the convolvers. The table entries have no simple relationship to
the weights of the neighborhood mask. The table-generating program
will present a graphic representation of the neighborhood, query
the user for mask values, prepare the required tables, and save
them in disk files for later loading into PIPE. A variety of other
table types are also handled. A related program finds the optimal
sequence of 3 x 3 kernels for the production of any N x N
convolution.
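
The idea behind building a large convolution from 3 x 3 kernels is that
applying small kernels in successive stages is equivalent to applying
the single kernel obtained by convolving them together. The sketch
below shows only this composition; it does not attempt the search for
an optimal sequence that the program above performs, and the function
names are illustrative.

```python
def convolve_kernels(a, b):
    """Full 2-D convolution of two kernels given as lists of rows."""
    rows = len(a) + len(b) - 1
    cols = len(a[0]) + len(b[0]) - 1
    out = [[0] * cols for _ in range(rows)]
    for i, a_row in enumerate(a):
        for j, a_val in enumerate(a_row):
            for k, b_row in enumerate(b):
                for m, b_val in enumerate(b_row):
                    out[i + k][j + m] += a_val * b_val
    return out


def compose(kernels):
    """Equivalent single kernel for a sequence of kernels applied in turn."""
    result = [[1]]  # identity kernel
    for kernel in kernels:
        result = convolve_kernels(result, kernel)
    return result


box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
k5 = compose([box, box])  # two 3 x 3 stages give one 5 x 5 kernel
```

Each additional 3 x 3 stage widens the equivalent kernel by two pixels,
so an N x N kernel (N odd) needs at least (N - 1) / 2 stages.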

At  the  next  level,  PIPE  is  programmable  in  PIPE  Command 
Language (PCL). In PCL, it is possible to refer to stages,
storage  areas,  and  operational  units  of  the  machine,  to  command 
their  states,  and  to  load  them  from  buffers  in  the  host. 

A  graphics-oriented  interactive  monitor  program  permits 
real-time  manipulation  of  PIPE  with  PCL.  This  functions  much  like 
a  low-level  monitor  program  on  a  traditional  computer,  except 
that  the  command  set  is  the  full  set  of  a  powerful  macro-assembly 
language  for  the  machine.  The  monitor  is  useful  principally  in 
debugging  programs  that  have  been  generated  by  other  means,  but 
it  is  possible  to  build  short  test  programs  with  it. 

PCL programs are generated principally by a low-level compiler
which accepts commands in PIPE Intermediate Format (PIF). PIF is a
higher-level language which permits the programmer to command
successive states of machine operations, to refer to named disk
files, and to associate the files with the machine entities. Such
files may be the micro-operation lists, sequencer programs, and
tables generated by auxiliary programs. These files represent
operations of broad general utility, and form a library upon which
PIF programmers can draw.


This  compiler  can  operate  either  from  PIF  programs  stored  in 
disk  files,  or  interactively  with  the  user  as  he  issues  PIF 
commands.  Most  of  the  existing  applications  software  for  PIPE  has 
been  written  in  PIF.  It  is  intended  as  the  intermediate  format 
which  will  be  generated  by  compilers  for  higher-level  image- 
processing  languages  now  under  construction. 


Software  tools  already  available  for  PIPE  would  permit  a 
competent  PIPE  programmer  to  write  all  PIPE  software  required  for 
most  applications  at  a  realistic  level  and  with  reasonable 
efficiency,  although  the  task  would  be  made  easier  if  higher- 
level  software  tools  currently  being  produced  were  available  now. 
A growing body of basic application routines for feature
detection, image enhancement, and similar elementary functions
already exists. About one day's time is sufficient for a
competent programmer to write such a routine. Experience suggests
that a doctoral-level computer scientist can become an expert
PIPE programmer in less than one month's time.

Figure 9. A PIPE programming diagram. Spatial progression of
stages is from left to right. Temporal progression of machine
cycles is from top to bottom. A single row indicates the state of
the machine in a single field time; a single column indicates the
successive states of a single stage.


DISCUSSION. 

PIPE  is  designed  as  a  front-end  processor  for  low-level 
iconic-to-iconic  image  processing.  It  is  intended  to  perform 
transformations  on  images  to  extract  features  similar  to  those  in 
the  primal  sketch  of  Marr. 7  These  features  make  intensity  changes 
and  local  geometric  relations  explicit  in  images,  while 
maintaining  the  spatial  representation.  In  this,  PIPE  differs 
from  many  processors  designed  for  image-processing.  These  other 
processors  are  usually  designed  to  perform  both  local  and  global 
image-processing  tasks,  often  in  an  interactive  environment. 

A  recent  survey  by  Reeves  divides  image-processing  tasks 
into  two  classes.  Low  level  image  processing  usually  modifies 
parts  of  images,  but  maintains  the  image  array.  Higher  level 
processing,  however,  works  on  symbolic  representations  of  the 
contents  of  images.  Low  level  processing  has  usually  given  rise 
to  architectures  based  on  single  instruction  stream,  multiple 
data stream (SIMD) structures. The higher level functions are
usually  carried  out  using  processors  based  on  multiple 
instruction  stream,  multiple  data  stream  (MIMD)  structures.  The 
design  of  PIPE  allows  it  to  act  as  a  SIMD  pipeline,  or  as  a 
(restricted) MIMD pipeline. The MIMD mode is entered whenever the
region-of-interest  operators  are  used.  The  limitations  on  these 
operators  are  that  there  are  at  most  256  different  operators 
available  per  stage,  and  that  using  the  region  of  interest 
generally  precludes  using  some  other  operators,  such  as  the 
functions  of  two  arguments.  Using  the  retrograde  pathway  to 
insert  expectations  from  the  host  into  the  image  analysis  process 
also  blurs  the  distinction  between  high  level  and  low  level 
processes.

Some general features of PIPE's architecture, multiple
pipelined planes of processing, and the concept of the
iconic-to-symbolic remapping performed by ISMAP, are drawn in
spirit from a more elaborate machine proposed but never constructed
by McCormick, Kent, and Dyer.12 That device was in turn inspired by
certain observations on the nature of processing in the visual
cortex. Although PIPE is in some respects a simpler device, it
also carries that analogy further with the implementation of the
backward pathway which emulates the connectivity in the
biological vision system, where, with the exception of the final
link between the retina and the lateral geniculate nucleus,
there are in fact more fibers descending from higher to lower
levels of processing than vice versa. The arrangement of
interactions among the three pathways, with combination of
images into stage buffers and separate operations available into
emerging pathways also is drawn from this model.

CONCLUSIONS. 

This  paper  has  described  a  new  image  pre-processor, 
consisting  of  a  sequence  of  identical  stages,  each  of  which  can 
perform  a  number  of  point  and  neighborhood  operations.  An 
important  feature  of  the  processor  is  the  provision  of  forward, 
recursive,  and  backward  paths  to  allow  image  data  to  participate 
in  temporal  as  well  as  spatial  neighborhood  operations.  The 
backward  pathway  also  allows  expectations  or  image  models  to  be 
inserted  into  the  system  by  the  host,  and  to  participate  in  the 
processing  in  the  same  way  as  images  acquired  from  the  input 
device.  The  region-of-interest  operator  is  also  a  powerful,  and 
unique,  feature  of  PIPE,  allowing  the  results  of  feature- 
extraction  processes  to  guide  further  image  analysis.  PIPE  also 
provides  a  multi-resolution  capability,  enabling  global  events  to 
be  made  local.  This  is  important  in  a  machine  that  has  only  local 
operators.  Much  research  needs  to  be  done  to  explore  the 
capabilities  of  the  device  but  early  experiments  indicate  that 
the  system  will  have  a  wide  range  of  applications  in  low-level 
real-time image processing.

Abstract: A method is presented that uses a diffusion-like process to describe the shape of a region. Convexity is not required,
the descriptor is invariant under several common transformations, is applicable in the n-dimensional case, and is easy to
compute.

1. Introduction

Shape description is an essential component of any
image-understanding system. Many approaches to description of
shape have been proposed and used in the fields of image
processing and computer vision. Pavlidis (1978) suggested a
taxonomy of shape descriptors based on: (1) whether just the
boundary, or the entire interior of the object was examined (the
techniques were called external and internal, respectively);
(2) whether the characterization was made on the basis of a scalar
transform (in which a picture is transformed into an array of
scalar features), or a space transform (a picture is transformed
into another picture); and (3) whether the procedure is or is not
information-preserving in the sense that the original image can
be reconstructed from the shape descriptors.

Existing methods include the ψ-s curve, in which ψ is computed
as the angle made between a fixed line and a tangent to the
boundary of the region; it is plotted against s, the arc length of
the boundary traversed. For a closed boundary, the function is
periodic, and may be associated with segmentation of the boundary
in terms of straight lines and circular arcs (Ballard and Brown,
1982). Other methods evaluate eccentricity (or elongatedness) in a
variety of ways, including length-to-width ratio and ratio of the
principal axes of inertia; compactness (e.g., perimeter^2/area,
and Danielsson's method (Danielsson, 1979)); the slope-density
function, which is a histogram of ψ collected over the boundary;
curvature, the derivative of ψ as a function of s; projections of
the figure onto an axis (the signatures); concavity with a tree of
regions that will create the convex hull of the original object;
shape numbers based on chain-coding of the boundary; and the
medial-axis transform, which transforms the original object to a
stick figure that approximates the skeleton of the figure.

We present here a new shape measure that allows rapid assignment
of labels that are both intuitively appealing and rigorously
based. The descriptor can be computed easily on existing hardware
and may be implemented immediately on future parallel-processing
systems. Regions need not be convex (although the modest
requirement is imposed that each region be simply-connected; i.e.,
have a single inside and a single outside). This is a significant
advantage in light of the comment by Pavlidis (1978) that there
exist a number of shape description techniques applicable only to
convex objects, while some of the general ones perform much better
if restricted to that class. The method described below works
equally well for convex and non-convex regions. Further, the
approach can be extended immediately to three-dimensional objects.

This is the first in a series of papers examining the behavior
of a diffusion-type shape descriptor. With respect to the taxonomy
noted above, it is internal, scalar, and non-information
preserving.

2.  Method 

The diffusion-type procedure simulates the release at an initial
time of a given number of particles from each pixel along the
boundary of a region to be studied. At each instant of discrete
time thereafter, new values of pixel contents are computed based
on an assumed diffusion constant and the isotropic assumption
(i.e., that the diffusion law applies equally in all directions
for all parts of the region under study). The process consists of
an initial transient and a subsequent steady-state condition. In
steady-state all pixels contain the same number of particles.
During the transient, however, the number of particles in each
boundary pixel depends upon the shape of the boundary. The
concentration is greater in concavities than in convexities, with
straight or nearly-straight regions having intermediate
concentrations. It is necessary therefore to stop the diffusion
process during the transient to detect these characteristics of
the boundary. When the simulated diffusion process is stopped, the
sequence of numbers of particles in the boundary pixels can be
used to generate a shape-related code.

This  approach  is  implemented  easily  on  digital 
computers  so  that  the  effects  of  changes  in  the 
following  relevant  process  parameters  can  be 
studied:  constant  of  diffusion,  stopping-time,  and 
initial  number  of  particles  per  pixel. 

Let $N_{i,j}(t)$ be the number of particles contained in the pixel
at coordinates $(i,j)$ at time $t$. Then the fundamental algorithm
to be utilized is:

$$N_{i,j}(t+1) = N_{i,j}(t) - 4K\,N_{i,j}(t) + K\bigl(N_{i-1,j}(t) + N_{i+1,j}(t) + N_{i,j+1}(t) + N_{i,j-1}(t)\bigr). \tag{1}$$


Equation (1) expresses the requirement that the number of
particles in a given pixel of the image at time t + 1 equals the
number of particles that were there at t, minus the number of
particles that were transferred by the assumed diffusion process
to the (4-)neighboring pixels, plus the number of incoming
particles from those same neighbors, based on their respective
contents at t. Neighbors that lie outside the region do not
participate in the process of equation (1).
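
A minimal sketch of one time step of equation (1) follows. It rests on
two assumptions that the text does not settle: a boundary pixel is
taken to be a region pixel with at least one 4-neighbor outside the
region, and a neighbor outside the region is treated as contributing
no incoming particles while the outgoing term stays at 4K times the
pixel's content. Under that reading particles are not conserved; a
reading in which nothing flows toward outside neighbors is equally
possible. The function names are illustrative.

```python
def release_on_boundary(region, particles):
    """Initial state: `particles` in each boundary pixel, zero elsewhere.

    region is a set of (i, j) pixels. A pixel counts as boundary when at
    least one of its 4-neighbors lies outside the region (an assumption).
    """
    counts = {}
    for (i, j) in region:
        neighbors = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
        on_boundary = any(p not in region for p in neighbors)
        counts[(i, j)] = float(particles) if on_boundary else 0.0
    return counts


def diffusion_step(counts, k):
    """Apply equation (1) once to every pixel of the region.

    counts maps each region pixel to its particle count. Pixels that are
    not keys of counts lie outside the region and contribute nothing.
    """
    new_counts = {}
    for (i, j), n in counts.items():
        incoming = 0.0
        for neighbor in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            incoming += counts.get(neighbor, 0.0)
        new_counts[(i, j)] = n - 4 * k * n + k * incoming
    return new_counts


# A 3 x 3 square region, 8 particles per boundary pixel, K = 0.125.
region = {(i, j) for i in range(3) for j in range(3)}
c0 = release_on_boundary(region, 8)
c1 = diffusion_step(c0, 0.125)
c2 = diffusion_step(c1, 0.125)
```

After two steps under these assumptions the corner pixels (convexities)
hold fewer particles than the mid-edge pixels, which agrees with the
qualitative behavior the text describes.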

Though in this preliminary communication we will consider only
the two-dimensional case, the approach can be generalized easily
to any number of dimensions. In the three-dimensional case the
basic algorithmic equation is:

$$N_{i,j,k}(t+1) = N_{i,j,k}(t) - 6K\,N_{i,j,k}(t) + K\bigl(N_{i-1,j,k}(t) + N_{i+1,j,k}(t) + N_{i,j-1,k}(t) + N_{i,j+1,k}(t) + N_{i,j,k-1}(t) + N_{i,j,k+1}(t)\bigr). \tag{2}$$

In general, for the n-dimensional case, if we call
$N_{x_1,x_2,\ldots,x_n}(t)$ the number of particles contained in
the pixel x at coordinates $x_1, x_2, \ldots, x_n$ at time $t$,
then the following equation will apply:

$$N_{x_1,x_2,\ldots,x_n}(t+1) = N_{x_1,x_2,\ldots,x_n}(t) - 2nK\,N_{x_1,x_2,\ldots,x_n}(t) + K\bigl(N_{x_1-1,x_2,\ldots,x_n}(t) + N_{x_1+1,x_2,\ldots,x_n}(t) + N_{x_1,x_2-1,\ldots,x_n}(t) + N_{x_1,x_2+1,\ldots,x_n}(t) + \cdots + N_{x_1,x_2,\ldots,x_n-1}(t) + N_{x_1,x_2,\ldots,x_n+1}(t)\bigr).$$

For the problem that we are considering we do not need to adjust
the parameter K to experimental data - as should be done in the
case of simulation of a real diffusion process of matter or heat.
So, for purposes of computation we can assign to K any value in
the range 0 < K < 1. The smaller K is, the higher will be the
degree of detail of the results of the diffusion-type process, but
of course at the expense of additional computer time. The tradeoff
between detail and accuracy on one hand and computer time on the
other will be discussed critically in the next paper of this
series.

3. Preliminary results

The algorithm was tested with two shapes: a square, and an
irregular region. The corresponding results are presented. At the
initial time (t = 0), 10000 particles were assigned to each
boundary pixel for each of the two cases. In the case of the
square, the number of particles for each pixel of the image has
been computed for times 10, 50, and 100. (See Figures 1a, b, and
c, respectively.) For the three cases a value of K = 0.01 was
used.

Figure 1(a). Number of particles in each pixel for the square
region, with K = 0.01: (a) t = 10.

Let us assume that we establish an order on the boundary of those
squares by labeling the pixels with consecutive natural numbers
following a clockwise direction, starting at the left uppermost.
In Figures 2a, b and c the number of particles on the boundary
pixels has been expressed as a function of the number utilized as
labels for the pixels, for each of the corresponding cases of
Figure 1.

In  the  case  of  the  irregular  shape,  only  the 
number  of  particles  of  each  pixel  on  the  boundary 
has  been  shown  for  times  3  and  10  in  Figures  3a 
and  b. 

For such irregular shapes the boundary pixels are labeled with
consecutive integers, again proceeding clockwise from the left
uppermost pixel. Graphs of the same kind as Figure 2 appear as
Figure 4 for the irregular shape.

Both for the case of the regular shape (the square) and the
irregular one, it can be seen immediately that the number of
particles per pixel - the concentration - is higher in concavities
than in convexities.

If human shape perception does rely heavily on detection of
curvature maxima (Attneave, 1954), then the (positive- and
negative-going) peaks in a plot of pixel content vs. boundary
location (corresponding, respectively, to segments of high
concavity and convexity) are likely to be useful in identifying
those important regions.

Figure 2. Number of particles for consecutive boundary pixels of the three cases of Figure 1, beginning at the left uppermost: (a) t = 10;
(b) t = 50; (c) t = 500. Horizontal axis: boundary-pixel sequence number.

4.  Conclusions  and  perspectives 

We have presented a new method for describing the shape of a
region. The region must be simply-connected, but need not be
convex. The descriptor is invariant under translation, rotations
by multiples of 90°, and, it appears, under scale changes. It is
almost invariant - in a sense that will be made precise in the
next paper - under rotations of non-multiples of 90°. It can be
implemented easily in hardware (especially on future parallel
processors), is as effective in higher dimensions as in two, and
will lend itself easily to image-processing and
pattern-recognition applications. The method appears to be
relatively insensitive to noise (cf. the medial-axis transform, in
which very large changes in the axis are produced by very small
changes in the boundary (Nevatia, 1982)), does not pose problems
with the definition of slope (Rosenfeld and Johnston, 1973) as
occurs in the ψ-s curve computation, and appears to be capable of
dealing with the matching of partially-occluded shapes (e.g., in
the robot vision case), since the diffusion-produced boundary
descriptors are likely to be less affected far from the occluding
boundary and thus can provide the basis for a partial match to a
pre-stored description of a complete boundary. The effects of
noise and of occlusion will be studied together in a forthcoming
paper.

Figure 3(a). Number of particles in each boundary pixel for the irregular region (interior pixels' values omitted for clarity), with
K = 0.01, t = 3.

The effect of changes in the diffusion constant, K, and the time
at which the process is stopped will be examined in detail in
future papers. A method and its proof are needed that will allow
for invariance of the measure under scale changes; these results
are expected soon. The present stopping criterion, which
terminates the process when the difference between maximum and
minimum values along the boundary is maximized, has great
intuitive appeal, since we are interested in distinguishing
concavities from convexities and can think of this as a
sensitivity measure.

Alternatives for the characterization of the results of the
diffusion-type process also will be considered in the future.
Particle-count plotted against boundary position is not
necessarily the most effective representation. Other possibilities
are: (1) to normalize the count by subtracting the mean and
dividing by the standard deviation, thus providing for the
possibility of establishing data-independent criteria for
detecting extrema; and (2) to use the first difference along the
boundary to detect regions of small and of large change.
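
The two alternative representations can be sketched as follows. The
function names are illustrative, the population standard deviation is
an assumed choice (the text does not say which estimator is meant),
and the first difference is taken circularly on the assumption that
the boundary sequence is closed.

```python
from statistics import mean, pstdev


def normalize(counts):
    """Subtract the mean and divide by the standard deviation."""
    mu = mean(counts)
    sigma = pstdev(counts)  # population standard deviation (assumed)
    if sigma == 0:
        # A flat sequence has no extrema; return zeros rather than divide.
        return [0.0 for _ in counts]
    return [(c - mu) / sigma for c in counts]


def first_difference(counts):
    """Circular first difference along a closed boundary."""
    n = len(counts)
    return [counts[(i + 1) % n] - counts[i] for i in range(n)]
```

A fixed threshold on the normalized sequence (for example, values
beyond some number of standard deviations) would then give a criterion
for extrema that does not depend on the initial particle count.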
INTRODUCTION

The quadcode is a hierarchical data structure for describing digital
images. It has the following properties:

- Straightforward representation of dimension, size, and the
relationship between an image and its subsets.

- Explicit description of geometric properties, such as location,
distance, and adjacency.

- Ease of conversion from and to raster representation.

The quadcode has applications to computer graphics and image processing
because of its ability to focus on selected subsets of the data and to
allow utilization of multiple resolutions in different parts of the
image. A related approach is the quadtree. Samet recently presented a
thorough survey of the literature in that field [1]. Gargantini [2] and
Abel [3] presented linear quadtrees and linear locational keys which are
efficient labeling techniques for quadtrees. In those papers, the
geometric concepts of the image are discussed by using the tree as an
interpretive medium, and the approaches and procedures are based on
traversal of the nodes in the tree. In this paper we present the
quadcode system which is a direct description of the image, and discuss
the geometric concepts in terms of the coded images.


1. QUADCODE


1.1 QUADCODE

The quadcode is a quaternary (base 4) code. A quadcode of length n is
of the form

$$Q = q_1 q_2 \ldots q_n$$

where $q_i = 0, 1, 2, 3$ for $i = 1, 2, \ldots, n$.

When  the  quadcode  is  used  in  describing  an  image,  each  character 
represents  one  operation  of  subdividing  the  image  or  its  subimage  into 
quadrants,  as  shown  and  labeled  in  Fig.  1. 


Fig.  1 


In  many  applications  (e.g.,  smoothing,  edge  detection,  shape  descrip¬ 
tion)  an  image  is  usually  subdivided  into  much  smaller  units,  so  the 
subdividing  operation  can  be  repeated  recursively  many  times,  until 
there  is  no  further  subdividing  needed.  Fig.  2  shows  the  subdividing 
process  and  the  corresponding  quadcodes.  A  particular  quadrant  is 
represented  by  one  of  0,  1,  2,  or  3,  concatenated  to  the  quadcode  of  its 
predecessor,  and  after  each  subdividing  operation  the  length  of  the 
quadcode  increases  by  one.  As  shown  in  the  figure,  the  quadcode  length 
signifies  how  many  subdividing  operations  have  been  done. 
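
The subdividing operation described above, and its inverse, can be
sketched with quadcodes held as strings of the digits 0 to 3. The
string representation and the function names are assumptions made for
illustration.

```python
def subdivide(code):
    """Return the quadcodes of the four quadrants of the subimage `code`."""
    return [code + digit for digit in "0123"]


def merge(codes):
    """Return the parent quadcode if `codes` are its four quadrants, else None."""
    codes = list(codes)
    if len(codes) != 4 or any(len(c) == 0 for c in codes):
        return None
    parent = codes[0][:-1]
    if sorted(codes) == subdivide(parent):
        return parent
    return None


def subdivision_count(code):
    """The quadcode length equals the number of subdividing operations."""
    return len(code)
```

For example, subdividing the subimage 12 yields 120, 121, 122, and 123,
and those four codes in any order merge back to 12.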


Fig.  2 


1.2 PROPERTIES OF THE QUADCODE

Property 1 (qc length)

The quadcode length of individual pixels in a $2^n$ by $2^n$ image is n.

Property 2 (dimension)

The side length of a subimage with quadcode length m in a $2^n$ by
$2^n$ image is $2^{n-m}$.


Property 3 (area)

The area of a subimage with quadcode length m in an image of area
$4^n$ is $4^{n-m}$.


Property 4 (subdividing)

Adding one character to a given quadcode subdivides the given
subimage into its quadrants (see Fig. 3).


Fig. 3 Subdividing and Merging


Property  5  (merging) 

If four quadcodes of length i have the first i-1 positions the
same, they represent the four quadrants of the same subimage which
is represented by the (i-1)-position quadcode (Fig. 3).

Property  6  (locating) 

A  subimage  can  be  located  in  an  image  according  to  each  position  of 
its  quadcode,  which  is  equivalent  to  subdivision  of  the  initial 
image  (see  Fig.  4). 


Property 7 (conversion)

Write the quadcode and array labels i, j of a pixel in a $2^n$ by
$2^n$ image (i, j are supposed to be from 0 to $2^n - 1$) all in
binary form; it can be proved [4] that

$$q_1 q_2 \ldots q_n = i_1 j_1 i_2 j_2 \ldots i_n j_n \tag{1}$$

or

$$i = \sum_{k=1}^{n} (q_k \ \mathrm{DIV}\ 2) \cdot 2^{n-k}, \qquad j = \sum_{k=1}^{n} (q_k \ \mathrm{MOD}\ 2) \cdot 2^{n-k} \tag{2}$$

$$Q = \sum_{k=1}^{n} (2 i_k + j_k) \cdot 4^{n-k} \tag{3}$$

In equations (1), (2), and (3), $q_k$, $i_k$, $j_k$ ($k = 1, \ldots, n$)
represent the k-th position of the quadcode and array labels i, j,
respectively. The equations show that the array labels i, j are the
same as the two binary numbers which compose the odd and even bits of
the quadcode when written in binary. This is represented in Table 1
where the two axes are array labels (or coordinates) i, j; the entries
give the corresponding quadcodes in both binary and quaternary forms.


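Table 1

Rows are labeled by i = i1 i2 and columns by j = j1 j2, both in binary.
Each entry is the quadcode i1 j1 i2 j2 in binary, with its quaternary
form in parentheses.

i1i2 \ j1j2 |    00     |    01     |    10     |    11
00          | 0000 (00) | 0001 (01) | 0100 (10) | 0101 (11)
01          | 0010 (02) | 0011 (03) | 0110 (12) | 0111 (13)
10          | 1000 (20) | 1001 (21) | 1100 (30) | 1101 (31)
11          | 1010 (22) | 1011 (23) | 1110 (32) | 1111 (33)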
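
The conversion of Property 7 amounts to interleaving the bits of i and
j. The sketch below assumes quadcodes are held as strings of the digits
0 to 3 and that i and j are non-negative integers below 2**n; the
function names are illustrative.

```python
def quadcode_from_ij(i, j, n):
    """Interleave the n-bit binary forms of i and j: q_k = 2*i_k + j_k."""
    digits = []
    for k in range(n - 1, -1, -1):  # most significant bit first
        i_bit = (i >> k) & 1
        j_bit = (j >> k) & 1
        digits.append(str(2 * i_bit + j_bit))
    return "".join(digits)


def ij_from_quadcode(code):
    """Recover the array labels (i, j) of a pixel from its quadcode."""
    n = len(code)
    i = j = 0
    for k, ch in enumerate(code, start=1):
        q = int(ch)
        i += (q // 2) * 2 ** (n - k)  # q_k DIV 2 is the k-th bit of i
        j += (q % 2) * 2 ** (n - k)   # q_k MOD 2 is the k-th bit of j
    return i, j
```

For instance, i = 01 and j = 10 in binary interleave to 0110, which is
the quaternary quadcode 12, and converting 12 back gives i = 1, j = 2.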


2.  ANALYSIS  OF  ELEMENTARY  SQUARES 


2.1 ELEMENTARY SQUARES

When a quadcode can represent a subset of an image, we call the subset
an elementary square.

For example, in Fig. 5 there are two elementary squares and their
quadcodes are A = 12 and B = 221. In general, an elementary square in
a $2^n$ by $2^n$ image can be represented by $q_1 \ldots q_m$ where $m \le n$.

The size of an elementary square with quadcode length m is $2^{n-m}$ by $2^{n-m}$.
For instance, in Fig. 5 the size of elementary square A is $2^{3-2}$ by $2^{3-2}$,
i.e., 2 by 2, and the size of B is $2^{3-3}$ by $2^{3-3}$, i.e., 1 by 1.


Fig.  5 


2.2 LOCATION

If we know the size of a given elementary square, we can locate it by
the coordinates of one of the characteristic points in the square. We
choose the upper left corner point as the characteristic point. Then,
from Eq. (2),

$$x = \sum_{k=1}^{m} (q_k \,\mathrm{MOD}\, 2) \cdot 2^{n-k}, \qquad y = \sum_{k=1}^{m} (q_k \,\mathrm{DIV}\, 2) \cdot 2^{n-k} \tag{4}$$

For example, the locations of the squares A = 12, B = 221 in Fig. 5 are

$$x(A) = \sum_{k=1}^{2} (q_{Ak} \,\mathrm{MOD}\, 2) \cdot 2^{3-k} = 4, \qquad y(A) = \sum_{k=1}^{2} (q_{Ak} \,\mathrm{DIV}\, 2) \cdot 2^{3-k} = 2$$

and

$$x(B) = \sum_{k=1}^{3} (q_{Bk} \,\mathrm{MOD}\, 2) \cdot 2^{3-k} = 1, \qquad y(B) = \sum_{k=1}^{3} (q_{Bk} \,\mathrm{DIV}\, 2) \cdot 2^{3-k} = 6$$


Fig. 5  Location and Distance


2.3 DISTANCE

The distance between two elementary squares A and B is composed of the
smallest distances between any points on the boundaries of the two
elementary squares in both horizontal and vertical directions, and they
are calculated by

$$D_x(A,B) = |x(B) - x(A)| - S_L, \qquad D_y(A,B) = |y(B) - y(A)| - S_U \tag{5}$$

where $S_L$ is the horizontal side length of the leftmost of the two
elementary squares A and B, and $S_U$ is the vertical side length of the
uppermost of the two elementary squares. Because the quadcodes of A, B
give their dimensions and their horizontal and vertical locations, both
$S_L$ and $S_U$ can be obtained from their quadcodes.

For example, in Fig. 5, A = 12, B = 221; the leftmost elementary square
is B, and the uppermost is A. Then,

$$x(A) = 4, \qquad y(A) = 2$$
$$x(B) = 1, \qquad y(B) = 6$$
$$S_L = S_B = 2^{3-3} = 1, \qquad S_U = S_A = 2^{3-2} = 2$$

Hence, the distance between A, B is

$$D_x(A,B) = |1 - 4| - 1 = 2$$
$$D_y(A,B) = |6 - 2| - 2 = 2$$

as shown in Fig. 5.

2.4 ADJACENCY

The detection of connectivity is a basic and important subject in image
analysis. The general condition of adjacency for elementary squares is
the following:

$$D_x(A,B) \cdot D_y(A,B) = 0, \qquad D_x(A,B) + D_y(A,B) \le 0 \tag{6}$$

If $D_x(A,B) = 0$, then $D_y(A,B) \le 0$.

According to the first equation of (5), $D_x(A,B) = 0$ means that the right
side of the far left square is co-linear with the left side of the far
right square. And according to the second equation of (5), $D_y(A,B) \le 0$
means that either the upper left corner of the far right square is on
the right side of the far left square (see Fig. 6(a)), or the upper
right corner of the far left square is on the left side of the far right
square (see Fig. 6(b)). If $D_y(A,B) = 0$, then $D_x(A,B) \le 0$ and the proof
is the same.


In Fig. 7 there are three elementary squares A = 1, B = 221, C = 310.
According to equation (5), the distances among them are

$$D_x(A,B) = 2, \qquad D_x(A,C) = -2, \qquad D_x(B,C) = 4$$
$$D_y(A,B) = 2, \qquad D_y(A,C) = 0, \qquad D_y(B,C) = 1$$

so $D_x(A,B) \cdot D_y(A,B) = 4$, $D_x(B,C) \cdot D_y(B,C) = 4$,

and $D_x(A,C) \cdot D_y(A,C) = 0$, $D_x(A,C) + D_y(A,C) = -2$.

In view of Eqs. (6), we observe that elementary squares A, C are
adjacent, but A, B and B, C are not, as shown in Fig. 7.
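
The location, distance, and adjacency computations of Eqs. (4) to (6) can be checked numerically with a short Python sketch. It is an illustration only: the function names `location`, `distance`, and `adjacent` are hypothetical, quadcodes are strings of digits 0 to 3, and the adjacency test is assumed to accept a sum of zero (corner contact); replacing `<=` with `<` excludes corner contact.

```python
def location(q, n):
    """Upper-left corner (x, y) and side length of square q in a 2^n image (Eq. 4)."""
    x = sum((int(d) % 2) << (n - k) for k, d in enumerate(q, start=1))
    y = sum((int(d) // 2) << (n - k) for k, d in enumerate(q, start=1))
    return x, y, 1 << (n - len(q))


def distance(a, b, n):
    """Horizontal and vertical distances (Dx, Dy) of Eq. (5)."""
    xa, ya, sa = location(a, n)
    xb, yb, sb = location(b, n)
    s_left = sa if xa <= xb else sb      # side length of the leftmost square
    s_upper = sa if ya <= yb else sb     # side length of the uppermost square
    return abs(xb - xa) - s_left, abs(yb - ya) - s_upper


def adjacent(a, b, n):
    """Adjacency condition of Eq. (6), corner contact included."""
    dx, dy = distance(a, b, n)
    return dx * dy == 0 and dx + dy <= 0
```

With n = 3 this reproduces the values quoted for Fig. 5 and Fig. 7: `distance("12", "221", 3)` gives (2, 2), and of the squares 1, 221, and 310 only the pair 1, 310 passes `adjacent`.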


Equation (6) is the general adjacency condition for elementary squares
of any size. For the connectivity detection of pixels, $S_L = S_U = 1$, the
distance equation (5) becomes

$$D_x(p_1,p_2) = |x(p_2) - x(p_1)| - 1, \qquad D_y(p_1,p_2) = |y(p_2) - y(p_1)| - 1 \tag{7}$$

and adjacency condition (6) becomes

$$\max\bigl(|x(p_2) - x(p_1)|,\; |y(p_2) - y(p_1)|\bigr) = 1 \tag{8}$$

or $|x(p_2) - x(p_1)| + |y(p_2) - y(p_1)| = 1$.

Equations  (5)  and  (6)  can  be  extended  to  detect  the  connectivity  of 
elongated  regions,  if  we  know  their  locations  (still  represented  by  the 
coordinates  of  their  upper  left  corner  points)  and  the  lengths  of  their 
sides. 
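
A minimal sketch of this extension, assuming each elongated region is an axis-aligned rectangle given as a tuple (x, y, width, height) with (x, y) its upper left corner. The tuple layout and the names `rect_distance` and `rect_adjacent` are assumptions made for illustration; the only change from Eq. (5) is that the side length becomes the width of the leftmost rectangle horizontally and the height of the uppermost rectangle vertically.

```python
def rect_distance(ra, rb):
    """(Dx, Dy) for two rectangles given as (x, y, width, height)."""
    xa, ya, wa, ha = ra
    xb, yb, wb, hb = rb
    w_left = wa if xa <= xb else wb      # width of the leftmost rectangle
    h_upper = ha if ya <= yb else hb     # height of the uppermost rectangle
    return abs(xb - xa) - w_left, abs(yb - ya) - h_upper


def rect_adjacent(ra, rb):
    """Same test as Eq. (6), applied to rectangles (corner contact included)."""
    dx, dy = rect_distance(ra, rb)
    return dx * dy == 0 and dx + dy <= 0
```

For example, a 4 by 1 strip at (0, 0) and a unit pixel at (1, 1) give (Dx, Dy) = (-3, 0), so they are reported as adjacent.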


Fig.  7 


3.  REGION  REPRESENTATION  AND  ANALYSIS 


3.1  REGION  REPRESENTATION 

In quadcode representation, a region is represented as a set of
quadcodes

$$R = \{Q_n\}, \qquad n = 1, \ldots, N$$

where each element represents an elementary square in region R, and N is
the total number of those elementary squares in region R.


For example, the region in Fig. 8 can be represented as


R = {03, 120, 121, 21, 230, 231, 300, 301, 320, 321}


Fig.  8 


3.2 SET ARITHMETIC FOR QUADCODES

Suppose A, B are the quadcodes of two elementary squares

$$A = a_1 \ldots a_k, \qquad B = b_1 \ldots b_m, \qquad \text{and } k < m$$


THEOREM 1


The intersection of two such quadcodes is either empty or the
longer quadcode:

$$a_1 \ldots a_k \cap b_1 \ldots b_m \;(k<m) = \begin{cases} b_1 \ldots b_m & \text{if } a_1 \ldots a_k = b_1 \ldots b_k \\ \emptyset & \text{otherwise} \end{cases} \tag{10}$$


Proof

(1) If $a_1 \ldots a_k = b_1 \ldots b_k$, then

$$B = b_1 \ldots b_m = a_1 \ldots a_k\, b_{k+1} \ldots b_m \subset a_1 \ldots a_k = A$$

(2) Otherwise, according to the definition of the quadcode,

$$a_1 \ldots a_k \cap b_1 \ldots b_k = \emptyset, \qquad b_1 \ldots b_m \subset b_1 \ldots b_k$$

Q.E.D.


THEOREM 2


The union of two such quadcodes is either the concatenation or the
shorter quadcode:

$$a_1 \ldots a_k \cup b_1 \ldots b_m \;(k<m) = \begin{cases} a_1 \ldots a_k & \text{if } a_1 \ldots a_k = b_1 \ldots b_k \\ \{a_1 \ldots a_k,\; b_1 \ldots b_m\} & \text{otherwise} \end{cases}$$


The proof is similar to the one above.


The laws of set arithmetic are true for quadcode sets too, and they
include the commutative, associative, and distributive laws, as well as
de Morgan's law.


3.3 UNION AND INTERSECTION


DEFINITION 


The  UNION  of  two  regions  is  represented  by  the  union  of  the  sets  of  the 
two  regions. 


For example, in Fig. 9 there are two regions


A = {12, 211, 3} and B = {03, 1, 21, 310}


The union of A, B, according to (10) is


A ∪ B = {12, 211, 3} ∪ {03, 1, 21, 310}

      = {03, 1 ∪ 12, 21 ∪ 211, 3 ∪ 310}

      = {03, 1, 21, 3}


DEFINITION 


The  INTERSECTION  of  two  regions  is  represented  by  the  intersection  of 
the  sets  of  the  two  regions. 


According to (9),

A ∩ B = {12, 211, 3} ∩ {03, 1, 21, 310}

      = {1 ∩ 12, 21 ∩ 211, 3 ∩ 310}

      = {12, 211, 310}


as  shown  in  Fig.  9. 
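
Because one elementary square contains another exactly when its quadcode is a prefix of the other's, Theorems 1 and 2 reduce to string prefix tests. The sketch below is an illustration under that reading; the names `intersect`, `region_intersection`, and `region_union` are hypothetical, and regions are held as Python sets of quadcode strings. It does not merge four sibling quadrants into their parent (Property 5).

```python
def intersect(a, b):
    """Theorem 1: the longer code if one code is a prefix of the other, else None."""
    short, long_ = sorted((a, b), key=len)
    return long_ if long_.startswith(short) else None


def region_intersection(ra, rb):
    """Pairwise intersections of the squares of two regions."""
    return {c for a in ra for b in rb if (c := intersect(a, b)) is not None}


def region_union(ra, rb):
    """Theorem 2: keep every code that is not covered by a shorter code."""
    codes = set(ra) | set(rb)
    return {c for c in codes
            if not any(o != c and c.startswith(o) for o in codes)}
```

Applied to the regions A = {12, 211, 3} and B = {03, 1, 21, 310} of Fig. 9, the sketch returns {03, 1, 21, 3} for the union and {12, 211, 310} for the intersection.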


CONCLUSIONS 


In the paper we introduced the quadcode system, discussed its geometric
properties, analyzed the geometric concepts of elementary squares, and
presented the quadcode set representation of regions. From the
discussion we see that the information-compact and intrinsically
hierarchical characteristics of the quadcode benefit the representations
and analysis by making them more intuitive and more easily computed. The
quadcode can also be used in a tree-structured representation. In
earlier work [5] and [6] we discussed the quadcode-labeled tree (the
qc-tree), its storage efficiency [5], and applications of the qc-tree,
such as boundary search [6]. Instead of using a tree as the interpretive
medium (transferring an image into the tree, performing the processing
by traversal and search of the tree, and then transferring back to the
image), this paper discussed opportunities and methods for analysis
directly on the coded image. In a companion paper [7], we discuss
adjacency detection and perform all the computation in terms of
quadcodes.

The properties and the results presented here for 2-D images can be
extended to 3-D images, where the octcode combines the coordinates of
three dimensions into one code. In binary form, each position of the
octcode is composed of, respectively, the z, y, and x coordinate bits

$$o_i = z_i y_i x_i$$

and the so-constructed octcode is also an intrinsically hierarchical
coding system.
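
A minimal sketch of the octcode construction, assuming each octal digit packs one bit of z, y, and x in that order, most significant position first. The function names `octcode_from_xyz` and `xyz_from_octcode` are hypothetical and serve only as an illustration of the bit interleaving.

```python
def octcode_from_xyz(x, y, z, n):
    """Build an n-digit octcode whose k-th digit is the bit triple z_k y_k x_k."""
    digits = []
    for k in range(n - 1, -1, -1):           # most significant bit first
        z_k, y_k, x_k = (z >> k) & 1, (y >> k) & 1, (x >> k) & 1
        digits.append(str(4 * z_k + 2 * y_k + x_k))
    return "".join(digits)


def xyz_from_octcode(o):
    """Split each octal digit back into its z, y, and x bits."""
    x = y = z = 0
    for d in o:
        v = int(d)
        z = (z << 1) | (v >> 2)
        y = (y << 1) | ((v >> 1) & 1)
        x = (x << 1) | (v & 1)
    return x, y, z
```

For example, with n = 2 the voxel x = 1, y = 2, z = 3 has bit triples 110 and 101, i.e. the octcode 65.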
INTRODUCTION

A  method  is  presented  for  determining  whether  two  given  regions  are 
adjacent,  and  for  finding  all  the  neighbors  of  different  sizes  for  a 
given  region.  Regions  are  defined  as  elementary  squares  of  any  size.  In 
a  previous  paper  [1]  we  introduced  the  quadcode  and  discussed  its  use  in 
representing  geometric  concepts  in  the  coded  image,  such  as  location, 
distance,  and  adjacency.  In  this  paper  we  give  a  further  discussion  of 
adjacency in terms of quadcodes. Gargantini [2] discussed adjacency
detection  using  linear  quadtrees.  Her  discussion  was  applied  to  pixels 
and  a  procedure  was  given  to  find  only  a  pixel's  southern  neighbor. 
This  paper  considers  elementary  squares  of  any  size,  and  gives  proce¬ 
dures  for  both  aspects  of  the  problem:  to  determine  whether  two  given 
regions  are  adjacent,  and  also  to  find  all  the  neighbors  of  different 
sizes  for  a  given  region. 


1.  ADJACENCY  DETECTION 

1.1  Adjacency  for  Elementary  Squares  of  Same  Size 

An elementary square is a square region which can be represented by a
quadcode. Suppose two elementary squares A, B in a $2^n$ by $2^n$ image are
the same size and their quadcodes are of length m

$$A = a_1 \ldots a_m, \qquad B = b_1 \ldots b_m, \qquad \text{and } m \le n.$$

Write $d_i = b_i - a_i$. We use bit operations and, without loss of
generality, suppose $d_k$ is the leftmost nonzero difference, i.e.,

$$b_i - a_i = 0 \ \text{ for } i < k, \qquad d_k = b_k - a_k \ne 0$$


THEOREM

For A, B to be adjacent (4-neighbor), they should satisfy

$$b_i - a_i = -d_k \tag{1}$$

for all $k < i \le m$ and $d_k = \pm 1$ or $\pm 2$, where the $a_i$ and $b_i$ are
terms of the quadcodes defined above.


PROOF 

We give the proof in the case m = n; the proof for other cases will be
similar. When m = n, A and B are pixels and the adjacency condition
[1] is

$$|x(B) - x(A)| + |y(B) - y(A)| = 1$$

i.e., (i) $x(B) - x(A) = \pm 1$ and $y(B) - y(A) = 0$

or (ii) $x(B) - x(A) = 0$ and $y(B) - y(A) = \pm 1$.

In case (i), A, B are horizontally adjacent, $\Delta x(A,B) = \pm 1$, $\Delta y(A,B) = 0$,
and $d_k = \pm 1$:

$$b_1 \ldots b_n - a_1 \ldots a_n = (b_k - a_k) \cdot 2^{n-k} + \sum_{i=k+1}^{n} (b_i - a_i) \cdot 2^{n-i} = d_k \cdot 2^{n-k} + \sum_{i=k+1}^{n} (b_i - a_i) \cdot 2^{n-i}$$

Hence, $b_i - a_i = -d_k$ for $k < i \le n$.

Q.E.D.

For example, in Fig. 1(a) there is a pixel A = 123 in a $2^3$ by $2^3$ image.

Eastern neighbor   E = 132,  $d_k = d_2 = 1$ and $d_3 = -1$
Western neighbor   W = 122,  $d_k = d_3 = -1$
Southern neighbor  S = 301,  $d_k = d_1 = 2$ and $d_2 = d_3 = -2$
Northern neighbor  N = 121,  $d_k = d_3 = -2$

In Fig. 1(b) there is a pixel A = 2111 in a $2^4$ by $2^4$ image.

(1) E = 3000,  $d_k = d_1 = 1$ and $d_2 = d_3 = d_4 = -1$
(2) W = 2110,  $d_k = d_4 = -1$
(3) S = 2113,  $d_k = d_4 = 2$
(4) N = 0333,  $d_k = d_1 = -2$ and $d_2 = d_3 = d_4 = 2$
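
Condition (1) can be tested directly on the digit differences of two equal-length quadcodes. The sketch below is an illustration with a hypothetical function name, `satisfies_condition_1`, and it uses plain digit arithmetic. Under that assumption the test is only the necessary condition stated in the theorem: for instance, the codes 02 and 11 pass it although the corresponding pixels are not adjacent, so a coordinate check is still needed to confirm adjacency.

```python
def satisfies_condition_1(a, b):
    """Check b_i - a_i = -d_k for all i > k, with d_k = +-1 or +-2 (condition 1)."""
    if len(a) != len(b):
        return False
    diffs = [int(y) - int(x) for x, y in zip(a, b)]
    nonzero = [i for i, d in enumerate(diffs) if d != 0]
    if not nonzero:
        return False                      # identical codes are not neighbors
    k = nonzero[0]                        # leftmost nonzero difference
    d_k = diffs[k]
    return abs(d_k) in (1, 2) and all(d == -d_k for d in diffs[k + 1:])
```

All eight neighbor pairs listed for A = 123 and A = 2111 satisfy the check, while a non-neighbor such as 130 does not.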

1.2  Adjacency  for  Elementary  Squares  of  Different  Sizes 


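Suppose two elementary squares are represented by quadcodes

$$A = a_1 \ldots a_m, \qquad B = b_1 \ldots b_r, \qquad \text{where } m > r$$

THEOREM

If the two elementary squares are adjacent, they should satisfy the
conditions

(i) $a_1 \ldots a_r$, $b_1 \ldots b_r$ satisfy condition (1);

(ii) if $d_k = b_k - a_k = 1$, then $a_i \,\mathrm{MOD}\, 2 = 1$ for $i > r$;
     if $d_k = -1$, then $a_i \,\mathrm{MOD}\, 2 = 0$ for $i > r$;
     if $d_k = 2$, then $a_i \,\mathrm{DIV}\, 2 = 1$ for $i > r$;
     if $d_k = -2$, then $a_i \,\mathrm{DIV}\, 2 = 0$ for $i > r$.          (3)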

We use Fig. 2 to explain what the conditions in (3) mean. The numbers
in the middle grid represent the quadcode of the larger square
$B = b_1 \ldots b_r$, which was supposed to be 0, 1, 2 or 3. The numbers around
represent the quadcodes of the smaller square $A = a_1 \ldots a_m$, whose
adjacency we wish to test. In Fig. 2, we suppose m - r = 3 and list the
lower positions of the quadcodes of A, i.e., $a_{r+1} \ldots a_m$. For instance,
elementary square 2 (or 3) has northern neighbors 0222, 0223, 0232,
0233, 0322, 0323, 0332, and 0333 (or 1222, 1223, 1232, 1233, 1322, 1323,
1332, and 1333), and elementary square 0 (or 2) has eastern neighbors
1000, 1002, 1020, 1022, 1200, 1202, 1220, and 1222 (or 3000, 3002, 3020,
3022, 3200, 3202, 3220, and 3222).

Fig.  2 

2.  FINDING  NEIGHBORS 

2.1  The  Pixel  Neighbors  of  a  Pixel 

Suppose a pixel in a $2^n$ by $2^n$ image has quadcode $A = a_1 \ldots a_n$, and
its eastern, western, southern and northern neighbor pixels have
quadcodes $E = e_1 \ldots e_n$, $W = w_1 \ldots w_n$, $S = s_1 \ldots s_n$, and
$N = n_1 \ldots n_n$, respectively.

The  following  is  an  algorithm  for  finding  the  quadcodes  of  neighbors. 

A. Eastern neighbor

Suppose j = max {i} such that $a_i$ = 0 or 2.

If j = 0, i.e., no such position exists, then stop. It means that A
does not have an eastern neighbor. Otherwise, we can construct the
neighbor's quadcode character by character, using

$$e_i = \begin{cases} a_i - 1 & i > j \\ a_i + 1 & i = j \\ a_i & i < j \end{cases} \qquad \text{for } i = 1, \ldots, n \tag{4}$$

Examples are shown in Table 1.


Table 1

A    012    023    213    113
E    013    032    302    none

B. Western neighbor

Suppose k = max {i} such that $a_i$ = 1 or 3.

If k = 0, it means that A does not have a western neighbor.
Otherwise,

$$w_i = \begin{cases} a_i + 1 & i > k \\ a_i - 1 & i = k \\ a_i & i < k \end{cases} \tag{5}$$

Examples are shown in Table 2.


Table 2

A    113    302    202
W    112    213    none


C. Southern neighbor

Suppose l = max {i} such that $a_i$ = 0 or 1.

If l = 0, it means that A does not have a southern neighbor.
Otherwise,

$$s_i = \begin{cases} a_i - 2 & i > l \\ a_i + 2 & i = l \\ a_i & i < l \end{cases} \tag{6}$$

Examples are shown in Table 3.


Table 3

A    301    213    123    223
S    303    231    301    none

D. Northern neighbor

Suppose m = max {i} such that $a_i$ = 2 or 3.

If m = 0, it means that A does not have a northern neighbor.
Otherwise,

$$n_i = \begin{cases} a_i + 2 & i > m \\ a_i - 2 & i = m \\ a_i & i < m \end{cases} \tag{7}$$

Examples are shown in Table 4.


Table 4

A    213    230    311    101
N    211    212    133    none


E. The neighbors with the same size of an elementary square

The adjacency condition in A, B, C, D can be used to find the same-size
neighbors of an elementary square $A = a_1 \ldots a_m$ ($m \le n$), just by
changing n to m.
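
The four constructions share one pattern: find the last position whose digit can move in the wanted direction, step that digit, and reflect every later digit the opposite way. The Python sketch below illustrates this under stated assumptions: the helper `_neighbor` and the names `east`, `west`, `south`, and `north` are hypothetical, and the case rules are the ones read from Eqs. (4) to (7) and the accompanying tables. Since the rule depends only on the code's own length, it serves for same-size neighbors of any elementary square.

```python
def _neighbor(a, pivot_digits, step):
    """Step the last digit found in pivot_digits; shift all later digits back."""
    pivot = max((i for i, d in enumerate(a) if d in pivot_digits), default=-1)
    if pivot < 0:
        return None                       # the square lies on the image border
    out = []
    for i, d in enumerate(a):
        v = int(d)
        if i == pivot:
            v += step
        elif i > pivot:
            v -= step
        out.append(str(v))
    return "".join(out)


def east(a):
    return _neighbor(a, "02", +1)


def west(a):
    return _neighbor(a, "13", -1)


def south(a):
    return _neighbor(a, "01", +2)


def north(a):
    return _neighbor(a, "23", -2)
```

For example, `east("213")` gives "302" and `south("123")` gives "301", while `east("113")` returns `None` because the pixel is on the eastern border.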

2.2  The  Neighbors  with  Smaller  Size 

To find the neighbors of an elementary square $A = a_1 \ldots a_m$ which are
of smaller size than A, we use a two-step algorithm.

A. Using the algorithms in Sec. 2.1, find the same-size neighbors of A;
the results are represented as $e_1 \ldots e_m$, $w_1 \ldots w_m$, $s_1 \ldots s_m$,
and $n_1 \ldots n_m$.

B. Add positions to each of those same-size neighbors, starting from
m + 1 until at most n, according to the following rules:

$$e_i = 0, 2 \qquad w_i = 1, 3 \qquad s_i = 0, 1 \qquad n_i = 2, 3$$

where $i = m + 1, \ldots, r$ and $r \le n$.

An example is shown in Fig. 3 and Table 5, which respectively show and
list all the neighbors equal to or smaller than the elementary square
A = 12 (n = 4).


Table 5

Equal-or-smaller Neighbors of A = 12 (n = 4)

Size     Eastern   Western   Southern   Northern
4x4      13        03        30         10
2x2      130       031       300        102
         132       033       301        103
1x1      1300      0311      3000       1022
         1302      0313      3001       1023
         1320      0331      3010       1032
         1322      0333      3011       1033
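
Step B of the algorithm can be sketched as follows, taking a same-size neighbor from step A as input. The names `FACING_DIGITS` and `equal_or_smaller` are hypothetical; the digit sets are the rules of step B, which keep each appended position on the side of the neighbor that faces A.

```python
from itertools import product

# Digits allowed for appended positions, per direction of the neighbor (step B).
FACING_DIGITS = {"E": "02", "W": "13", "S": "01", "N": "23"}


def equal_or_smaller(neighbor, direction, n):
    """List the same-size neighbor and all its smaller descendants touching A."""
    digits = FACING_DIGITS[direction]
    result = []
    for extra in range(n - len(neighbor) + 1):      # 0 .. n - m added positions
        for tail in product(digits, repeat=extra):
            result.append(neighbor + "".join(tail))
    return result
```

Called with the eastern same-size neighbor 13 of A = 12 and n = 4, the sketch yields 13, 130, 132, 1300, 1302, 1320, 1322, which is the Eastern column of Table 5.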

2.3  The  Neighbors  with  Greater  Size 

To find the neighbors of an elementary square $A = a_1 \ldots a_m$ which are
larger than A, the algorithm is as follows:

A. Using the algorithms in Sec. 2.1, find the same-size neighbors of A;
the results are represented as $e_1 \ldots e_m$, $w_1 \ldots w_m$, $s_1 \ldots s_m$,
and $n_1 \ldots n_m$.

B. Delete the positions from the rear one by one until reaching a
position whose value is the same as the corresponding $a_i$.

An example is shown in Fig. 4 and Table 6, which lists all the neighbors
equal to or larger than the elementary square A = 2133 (n = 4).

Table  6 

Equal-or-larger  Neighbors  of  A  ■  2133  (n  *  4) 


Direction 

Size 

Eastern 

Western 

Southern 

Northern 

2132 
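
One possible reading of step B, sketched in Python: keep shortening the same-size neighbor from the rear, and stop as soon as the remaining code coincides with the corresponding leading positions of A, since such a code would contain A itself. The function name `equal_or_larger` is hypothetical, and this interpretation of the stopping rule is an assumption rather than a statement of the authors' exact procedure.

```python
def equal_or_larger(a, neighbor):
    """Prefixes of the same-size neighbor that are not also prefixes of a."""
    result = []
    for length in range(len(neighbor), 0, -1):
        if neighbor[:length] == a[:length]:
            break                         # this square would contain A itself
        result.append(neighbor[:length])
    return result
```

For A = 2133, the western same-size neighbor 2132 yields only itself, because its parent 213 already contains A; the eastern same-size neighbor 3022 yields 3022, 302, 30, and 3.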


Table  7 


Conclusions 


Constructive  methods  are  presented  for  finding  neighbors  of  a 
region  -  regardless  of  size  or  direction  -  using  quadcodes.  The  methods 
are  straightforward  and  simple  because  they  make  efficient  use  of  the 
position  and  size  information  implicit  in  the  quadcode.  The  generality 
of  the  approach  makes  it  a  good  candidate  for  parallel  implementation. 